❓ Frequently Asked Questions (FAQ)

General Questions

What is SPEX?

SPEX is a comprehensive Python library for spatial omics analysis (proteomics and transcriptomics). It provides tools for image processing, cell segmentation, feature extraction, clustering, and spatial analysis of multi-channel fluorescence microscopy data.

What types of data does SPEX support?

SPEX supports:

- Multi-channel TIFF images
- Fluorescence microscopy data
- Spatial omics datasets (proteomics and transcriptomics)
- Single-cell imaging data
- Any multi-dimensional image data with spatial information

Is SPEX free to use?

Yes, SPEX is open-source and free to use under the Apache License 2.0.

What are the system requirements?

Minimum Requirements:

- Python 3.8+
- 8 GB RAM
- 4 CPU cores

Recommended Requirements:

- Python 3.9+
- 16 GB+ RAM
- 8+ CPU cores
- GPU support (optional, for acceleration)
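As a quick sanity check, the Python version and CPU core count listed above can be compared against the current machine with the standard library alone. This is a minimal sketch: the helper name `check_environment` is hypothetical, and RAM is not checked because the standard library has no portable call for it.

```python
import os
import sys

# Thresholds copied from the minimum requirements listed above.
MIN_PYTHON = (3, 8)
MIN_CPU_CORES = 4


def check_environment():
    """Return a small report comparing this machine with the stated minimums."""
    cores = os.cpu_count() or 0  # os.cpu_count() may return None
    return {
        "python_version": tuple(sys.version_info[:3]),
        "python_ok": tuple(sys.version_info[:2]) >= MIN_PYTHON,
        "cpu_cores": cores,
        "cpu_ok": cores >= MIN_CPU_CORES,
    }


report = check_environment()
for key, value in report.items():
    print(f"{key}: {value}")
```

A `False` value for `python_ok` or `cpu_ok` only means the machine is below the stated minimum; it does not by itself mean SPEX will fail to run.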

Installation

How do I install SPEX?

pip install spex-tools

I'm getting installation errors. What should I do?

Common Solutions:

Update pip and setuptools:

pip install --upgrade pip setuptools wheel

Install in a virtual environment:

python -m venv spex_env
source spex_env/bin/activate  # On Windows: spex_env\Scripts\activate
pip install spex-tools

Install system dependencies (Ubuntu/Debian):

sudo apt-get update
sudo apt-get install libgl1-mesa-glx libglib2.0-0

For M1/M2 Macs:
```
conda install -c conda-forge spex-tools
```

How do I install optional dependencies?

# For GPU support
pip install spex-tools[gpu]

# For development
pip install spex-tools[dev]

# For all optional dependencies
pip install spex-tools[all]

Data Loading

What image formats are supported?

SPEX supports:

- TIFF/TIF (recommended)
- PNG
- JPEG/JPG
- BMP
- Most formats supported by PIL/Pillow

How do I load my image data?

import spex as sp

# Load single image
Image, channel = sp.load_image('path/to/image.tiff')

# Load multiple images
images = []
for file in image_files:
    img, ch = sp.load_image(file)
    images.append(img)

My image has multiple channels. How do I handle them?

# Load multi-channel image
Image, channel = sp.load_image('multichannel.tiff')

print(f"Image shape: {Image.shape}")  # (height, width, channels)
print(f"Channels: {channel}")

# Access specific channel
channel_0 = Image[:, :, 0]
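The indexing pattern above can be tried on a synthetic array before running it on real data. The sketch below assumes the (height, width, channels) layout stated in the snippet; the array is a stand-in for a loaded image, not output from `sp.load_image`.

```python
import numpy as np

# Synthetic stand-in for a loaded image: 4 rows, 5 columns, 3 channels.
image = np.arange(4 * 5 * 3, dtype=np.float32).reshape(4, 5, 3)

# Channel-last indexing, as in the snippet above.
channel_0 = image[:, :, 0]             # shape (4, 5)

# Select several channels at once with a list of indices.
subset = image[:, :, [0, 2]]           # shape (4, 5, 2)

# If an array is stored channel-first, np.moveaxis converts it to channel-last.
channel_first = np.zeros((3, 4, 5), dtype=np.float32)
channel_last = np.moveaxis(channel_first, 0, -1)   # shape (4, 5, 3)

print(channel_0.shape, subset.shape, channel_last.shape)
```

Printing the shape after each step is a cheap way to confirm which axis holds the channels before passing channel indices to later functions.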

How do I load images that are too large for memory?

Use chunked loading:

# Load in chunks
Image, channel = sp.load_image('large_image.tiff', chunk_size=1000)

Reduce image resolution:

from PIL import Image as PILImage  # alias avoids shadowing the `Image` array used elsewhere

img = PILImage.open('large_image.tiff')
img_resized = img.resize((img.width // 2, img.height // 2))
img_resized.save('resized_image.tiff')

Use memory-efficient data types:

import numpy as np
Image = Image.astype(np.float32)  # Instead of float64
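To see what the cast saves, the memory footprint of the same array can be compared in both data types. This is a small illustration on random data, not a measurement of a real SPEX image.

```python
import numpy as np

# 256 x 256 pixels, 4 channels; NumPy generates float64 by default.
image64 = np.random.default_rng(0).random((256, 256, 4))
image32 = image64.astype(np.float32)

mb = 1024 ** 2
print(f"float64: {image64.nbytes / mb:.2f} MB")   # 2.00 MB
print(f"float32: {image32.nbytes / mb:.2f} MB")   # 1.00 MB

# Largest rounding error introduced by the cast (values are in [0, 1)).
max_error = np.abs(image64 - image32).max()
print(f"max rounding error: {max_error:.1e}")
```

The cast halves memory use at the cost of a small rounding error, which is usually negligible for fluorescence intensities.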

Image Preprocessing

What preprocessing steps should I apply?

Recommended pipeline:

# 1. Background subtraction
Image_bg = sp.background_subtract(Image, list(range(len(channel))))

# 2. Denoising
Image_denoised = sp.nlm_denoise(Image_bg, list(range(len(channel))))

# 3. Optional: Additional filtering
Image_filtered = sp.median_denoise(Image_denoised, list(range(len(channel))))

How do I choose the right preprocessing parameters?

Background Subtraction:

- Use default parameters for most cases
- Increase window_size for larger background variations
- Decrease window_size to preserve fine detail

Denoising:

- nlm_denoise: Better for preserving edges, slower
- median_denoise: Faster, good for salt-and-pepper noise
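To illustrate why a median filter suits salt-and-pepper noise, here is a minimal NumPy sketch. It is a generic 3x3 median filter written for illustration, not the implementation behind `sp.median_denoise`; the helper name `median_filter_3x3` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_3x3(channel):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    padded = np.pad(channel, 1, mode="edge")        # repeat border pixels
    windows = sliding_window_view(padded, (3, 3))   # shape (H, W, 3, 3)
    return np.median(windows, axis=(-2, -1))


flat = np.full((7, 7), 10.0)   # uniform background
noisy = flat.copy()
noisy[3, 3] = 255.0            # isolated bright pixel ("salt")
noisy[1, 5] = 0.0              # isolated dark pixel ("pepper")

cleaned = median_filter_3x3(noisy)
print("outliers removed:", np.array_equal(cleaned, flat))
```

Because each 3x3 window contains at most two outliers out of nine values, the median ignores them entirely, whereas a mean filter would smear them into neighbouring pixels.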

My images are too noisy. What should I do?

Increase denoising strength:

Image_denoised = sp.nlm_denoise(Image, list(range(len(channel))), h=0.1)

Apply multiple denoising steps:

Image_denoised1 = sp.nlm_denoise(Image, list(range(len(channel))))
Image_denoised2 = sp.median_denoise(Image_denoised1, list(range(len(channel))))

Check image acquisition settings:

- Increase exposure time
- Use higher quality cameras
- Optimize illumination

Cell Segmentation

Which segmentation method should I use?

Cellpose (Recommended):

- Best for most cell types
- Automatic parameter estimation
- Good for irregular shapes

StarDist:

- Good for star-shaped objects
- Faster than Cellpose
- Requires more parameter tuning

Watershed:

- Good for round, regular cells
- Fastest method
- Requires good preprocessing

How do I set the cell diameter parameter?

# Automatic estimation (recommended)
labels = sp.cellpose_cellseg(Image, [0])

# Manual setting
labels = sp.cellpose_cellseg(Image, [0], diameter=30)

# Estimate from an existing segmentation (for example, the automatic run above)
import numpy as np
from skimage import measure

props = measure.regionprops(labels)
diameters = [prop.equivalent_diameter for prop in props]
print(f"Average cell diameter: {np.mean(diameters):.1f} pixels")
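The equivalent diameter used above is the diameter of a circle with the same area as the object, d = 2 * sqrt(area / pi). The sketch below computes it with NumPy alone on a synthetic label image; the helper name `equivalent_diameters` is hypothetical, and it assumes label 0 is background.

```python
import numpy as np


def equivalent_diameters(labels):
    """Diameter of the circle with the same area as each labelled object."""
    areas = np.bincount(labels.ravel())[1:]   # drop background label 0
    areas = areas[areas > 0]                  # skip unused label ids
    return 2.0 * np.sqrt(areas / np.pi)


labels = np.zeros((40, 40), dtype=np.int64)
labels[2:12, 2:12] = 1      # 100-pixel object
labels[20:40, 20:40] = 2    # 400-pixel object

diameters = equivalent_diameters(labels)
print(f"Median diameter: {np.median(diameters):.1f} pixels")
```

Since the estimate needs a label image, one practical pattern is to run a first pass with automatic estimation and use the median diameter from that pass as the manual value for a second pass.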

My segmentation is missing cells. What can I do?

Adjust diameter parameter:

# Try smaller diameter
labels = sp.cellpose_cellseg(Image, [0], diameter=20)

Use cell rescue:

labels_rescued = sp.rescue_cells(Image, labels, [0])

Improve preprocessing:

# Better contrast enhancement
Image_enhanced = sp.background_subtract(Image, [0], window_size=50)

How do I remove false positive cells?

# Remove small objects
labels_clean = sp.remove_small_objects(labels, min_size=50)

# Remove large objects
labels_clean = sp.remove_large_objects(labels_clean, max_size=1000)

# Combine both
labels_clean = sp.remove_small_objects(labels, min_size=50)
labels_clean = sp.remove_large_objects(labels_clean, max_size=1000)
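For readers who want to see what size filtering does to a label image, here is a minimal NumPy sketch. It is not the SPEX implementation: the helper name `filter_labels_by_size` is hypothetical, and treating both bounds as inclusive is an assumption.

```python
import numpy as np


def filter_labels_by_size(labels, min_size=50, max_size=1000):
    """Set objects outside [min_size, max_size] pixels to background (0)."""
    areas = np.bincount(labels.ravel())           # pixel count per label id
    keep = (areas >= min_size) & (areas <= max_size)
    keep[0] = False                               # background stays background
    return np.where(keep[labels], labels, 0)


labels = np.zeros((60, 60), dtype=np.int64)
labels[0:2, 0:2] = 1        # 4 pixels: too small
labels[5:15, 5:15] = 2      # 100 pixels: kept
labels[20:60, 20:60] = 3    # 1600 pixels: too large

cleaned = filter_labels_by_size(labels)
print(np.unique(cleaned))   # [0 2]
```

Surviving objects keep their original label ids, so downstream tables indexed by label remain valid.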

Feature Extraction

What features are extracted?

SPEX extracts:

- Morphological features: Area, perimeter, eccentricity, etc.
- Intensity features: Mean, std, min, max, etc.
- Texture features: Haralick features, GLCM features
- Spatial features: Centroid, bounding box, etc.

How do I extract features for specific channels?

# Extract features for all channels
features = sp.feature_extraction(Image, labels, list(range(len(channel))))

# Extract features for specific channels
features = sp.feature_extraction(Image, labels, [0, 2])  # Channels 0 and 2

# Extract features for single channel
features = sp.feature_extraction(Image, labels, [0])

How do I handle missing values in features?

# Remove cells with missing values
features_clean = features.dropna()

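# Fill missing values with the column mean (numeric columns only)
features_filled = features.fillna(features.mean(numeric_only=True))

# Interpolate missing values (only meaningful when rows have a natural order)
features_interpolated = features.interpolate(method='linear')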

Can I extract custom features?

# Extract basic features first
features = sp.feature_extraction(Image, labels, [0])

# Add custom features
features['custom_ratio'] = features['area'] / features['perimeter']
features['intensity_density'] = features['mean_intensity'] / features['area']

Clustering

Which clustering method should I use?

Leiden (Recommended):

- Best for most datasets
- Handles large datasets well
- Good cluster quality

Louvain:

- Similar to Leiden
- Slightly faster
- May produce fewer clusters

Phenograph:

- Good for complex datasets
- Automatic parameter estimation
- May be slower

How do I choose the resolution parameter?

import numpy as np

# Try multiple resolutions
resolutions = [0.1, 0.3, 0.5, 0.7, 1.0]
results = {}

for res in resolutions:
    clusters = sp.cluster(adata, method='leiden', resolution=res)
    n_clusters = len(np.unique(clusters))
    results[res] = n_clusters
    print(f"Resolution {res}: {n_clusters} clusters")

# Choose based on expected number of cell types

How do I validate clustering results?

from sklearn.metrics import silhouette_score

# Calculate silhouette score
silhouette_avg = silhouette_score(features, clusters)
print(f"Silhouette score: {silhouette_avg:.3f}")

# Visualize clusters
import matplotlib.pyplot as plt
plt.scatter(features['umap_1'], features['umap_2'], c=clusters, cmap='tab10')
plt.colorbar()
plt.show()

My clusters don't make biological sense. What should I do?

Check feature quality:

# Remove low-quality features
feature_variance = features.var()
good_features = feature_variance[feature_variance > 0.01].index
features_filtered = features[good_features]

Try different preprocessing:

# Normalize features
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
features_normalized = scaler.fit_transform(features)
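What the scaler does can be written out in a few lines of NumPy, which helps when checking results by hand. This is a sketch of column-wise z-scoring under the same convention as `StandardScaler` (population standard deviation, constant columns left unscaled); the helper name `standardize` and the example values are hypothetical.

```python
import numpy as np


def standardize(matrix):
    """Scale each column to zero mean and unit variance."""
    mean = matrix.mean(axis=0)
    std = matrix.std(axis=0)
    std[std == 0] = 1.0   # constant columns become all zeros instead of dividing by 0
    return (matrix - mean) / std


# Rows are cells; columns are area, mean intensity, and a constant feature.
feature_matrix = np.array([
    [100.0, 0.2, 5.0],
    [400.0, 0.4, 5.0],
    [900.0, 0.9, 5.0],
])
scaled = standardize(feature_matrix)
print(scaled.round(2))
```

Without scaling, a feature such as area (hundreds of pixels) would dominate distance calculations over a feature such as mean intensity (fractions), which is one common reason clusters fail to match biology.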

Use different clustering method:

clusters = sp.phenograph_cluster(features, k=30)

Spatial Analysis

What spatial analysis methods are available?

SPEX provides:

- CLQ (Co-Localization Quotient): Measures spatial co-occurrence
- Niche Analysis: Identifies spatial niches
- Spatial Autocorrelation: Moran's I, Geary's C
- Differential Expression: Spatial-aware DE analysis

How do I calculate spatial relationships between cell types?

# Calculate CLQ between cell types
clq_matrix = sp.CLQ_vec_numba(labels, cluster_labels, 
                             cell_types=[0, 1, 2], 
                             radius=50)

# Visualize CLQ matrix
import matplotlib.pyplot as plt
import seaborn as sns

sns.heatmap(clq_matrix, annot=True, cmap='RdBu_r', center=0)
plt.show()

How do I identify spatial niches?

# Perform niche analysis
niche_results = sp.niche(labels, cluster_labels, 
                        radius=100, 
                        min_cells=10)

# Visualize niches
niche_results.plot_niches()

How do I interpret spatial autocorrelation results?

# Calculate Moran's I
moran_i = sp.spatial_autocorrelation(features, labels, method='moran')

# Interpretation:
# Moran's I > 0: Positive spatial autocorrelation (clustering)
# Moran's I < 0: Negative spatial autocorrelation (dispersion)
# Moran's I ≈ 0: Random spatial distribution
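The interpretation above is easier to trust after seeing the statistic on toy data. The sketch below implements the textbook Moran's I with binary distance-band weights (1 for pairs closer than `radius`, 0 otherwise). It is not the code behind `sp.spatial_autocorrelation`; the function name `morans_i` and the weighting scheme are assumptions made for illustration.

```python
import numpy as np


def morans_i(coords, values, radius):
    """Moran's I with binary weights: w_ij = 1 if 0 < distance(i, j) <= radius."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Pairwise Euclidean distances between all points.
    diffs = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    weights = ((dist > 0) & (dist <= radius)).astype(float)

    dev = values - values.mean()
    numerator = (weights * np.outer(dev, dev)).sum()
    denominator = (dev ** 2).sum()
    return (len(values) / weights.sum()) * numerator / denominator


# Six cells on a line, one unit apart; neighbours are the adjacent cells.
coords = [(x, 0.0) for x in range(6)]
clustered = morans_i(coords, [0, 0, 0, 1, 1, 1], radius=1.0)
alternating = morans_i(coords, [1, 0, 1, 0, 1, 0], radius=1.0)
print(f"clustered: {clustered:.2f}, alternating: {alternating:.2f}")
# clustered: 0.60, alternating: -1.00
```

The dense distance matrix makes this sketch quadratic in the number of cells, so it is suitable for checking intuition on small examples rather than for whole-slide data.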

Performance and Optimization

My analysis is running slowly. How can I speed it up?

Use parallel processing:

# Set number of jobs
clusters = sp.cluster(adata, method='leiden', n_jobs=4)

Reduce image size:

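# Resize image before processing
from skimage.transform import resize

# preserve_range=True keeps integer intensities on their original scale
# (by default, integer images are converted to floats in [0, 1])
Image_small = resize(Image, (Image.shape[0] // 2, Image.shape[1] // 2),
                     preserve_range=True, anti_aliasing=True)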

Use chunked processing:

# Process large images in chunks
labels = sp.cellpose_cellseg(Image, [0], chunk_size=512)
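The idea behind chunked processing can be sketched independently of any library: split the image into tiles and handle one tile at a time. The generator below is a hypothetical helper (`iter_tiles`), not how `chunk_size` is implemented in SPEX.

```python
import numpy as np


def iter_tiles(height, width, tile_size):
    """Yield (row_slice, col_slice) pairs that cover the image without overlap."""
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            yield (slice(top, min(top + tile_size, height)),
                   slice(left, min(left + tile_size, width)))


image = np.ones((1000, 1300, 2), dtype=np.float32)   # synthetic 2-channel image
covered = np.zeros(image.shape[:2], dtype=np.int32)  # how often each pixel is visited
n_tiles = 0

for rows, cols in iter_tiles(image.shape[0], image.shape[1], tile_size=512):
    tile = image[rows, cols, :]   # a view into the image, so no copy is made
    covered[rows, cols] += 1      # stand-in for real per-tile work
    n_tiles += 1

print(f"{n_tiles} tiles, every pixel visited once: {bool((covered == 1).all())}")
```

Non-overlapping tiles cut through cells that straddle a tile border, so real tiled segmentation usually adds an overlap margin and merges labels afterwards.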

How do I handle large datasets that don't fit in memory?

Use Dask for out-of-memory processing:

import dask.array as da
Image_dask = da.from_array(Image, chunks=(1000, 1000, -1))

Process in batches:

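import gc

# Process multiple files in batches
# (file_batches and process_batch are placeholders for your own batching and pipeline code)
for batch in file_batches:
    process_batch(batch)
    gc.collect()  # Clear memory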
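A self-contained version of the batch loop might look like the sketch below. The `batched` helper, the placeholder `process_batch`, and the file names are all hypothetical; the point is the pattern of keeping only small per-file results and releasing everything else between batches.

```python
import gc


def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_batch(paths):
    """Placeholder: replace with the real per-file pipeline."""
    return [len(path) for path in paths]


image_files = [f"sample_{i}.tiff" for i in range(5)]   # hypothetical file names
summaries = []
for batch in batched(image_files, batch_size=2):
    summaries.extend(process_batch(batch))   # keep only small per-file results
    gc.collect()                             # release unreferenced objects between batches

print(f"Processed {len(summaries)} files")
```

Memory is only freed if nothing still references the large arrays, so avoid appending full images or label arrays to a list that outlives the loop.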

Use memory-efficient data types:

Image = Image.astype(np.float32)  # Instead of float64

How do I monitor memory usage?

import psutil
import os

def get_memory_usage():
    process = psutil.Process(os.getpid())
    return process.memory_info().rss / 1024 / 1024  # MB

print(f"Memory usage: {get_memory_usage():.1f} MB")

Troubleshooting

I'm getting "CUDA out of memory" errors

Solutions:

Reduce batch size:

labels = sp.cellpose_cellseg(Image, [0], batch_size=1)

Use CPU instead of GPU:

labels = sp.cellpose_cellseg(Image, [0], use_gpu=False)

Process smaller image chunks:

labels = sp.cellpose_cellseg(Image, [0], chunk_size=256)

My segmentation is producing too many/few cells

Too many cells:

# Increase minimum cell size
labels = sp.remove_small_objects(labels, min_size=100)

# Increase diameter parameter
labels = sp.cellpose_cellseg(Image, [0], diameter=40)

Too few cells:

# Decrease diameter parameter
labels = sp.cellpose_cellseg(Image, [0], diameter=20)

# Use cell rescue
labels = sp.rescue_cells(Image, labels, [0])

I'm getting "No module named 'spex'" error

Solutions:

Check installation:

pip list | grep spex

Reinstall SPEX:

pip uninstall spex-tools
pip install spex-tools

Check Python environment:
```
which python
pip --version
```
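A frequent cause of this error is that the package was installed into a different interpreter than the one running the script. The check below can be run from inside the failing environment; it uses only the standard library, and the helper name `describe_install` is hypothetical.

```python
import importlib.util
import sys


def describe_install(module_name="spex"):
    """Report which interpreter is running and whether it can find the module."""
    spec = importlib.util.find_spec(module_name)   # None if the module is not installed
    return {
        "interpreter": sys.executable,
        "module": module_name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


info = describe_install("spex")
for key, value in info.items():
    print(f"{key}: {value}")
```

If `found` is `False`, installing with `python -m pip install spex-tools` using the interpreter path printed above ensures that pip and Python refer to the same environment.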

My clustering is producing only one cluster

Solutions:

Check feature variance:

feature_variance = features.var()
print(f"Features with zero variance: {(feature_variance == 0).sum()}")

Increase resolution:

clusters = sp.cluster(adata, method='leiden', resolution=1.0)

Try different clustering method:

clusters = sp.phenograph_cluster(features, k=30)

How do I save and load my analysis results?

Save results:

import pickle

# Save segmentation
with open('segmentation.pkl', 'wb') as f:
    pickle.dump(labels, f)

# Save features
features.to_csv('features.csv')

# Save clustering
with open('clustering.pkl', 'wb') as f:
    pickle.dump(clusters, f)

Load results:

import pickle
import pandas as pd

# Load segmentation
with open('segmentation.pkl', 'rb') as f:
    labels = pickle.load(f)

# Load features
features = pd.read_csv('features.csv', index_col=0)

# Load clustering
with open('clustering.pkl', 'rb') as f:
    clusters = pickle.load(f)
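When the segmentation is a plain NumPy array, NumPy's own `.npy` format is an alternative to pickle that does not execute code on load. The sketch below assumes `labels` is an integer array and writes to a temporary directory so that it can be run anywhere.

```python
import os
import tempfile

import numpy as np

# Synthetic label image standing in for a real segmentation result.
labels = np.zeros((8, 8), dtype=np.int32)
labels[2:5, 2:5] = 1

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "segmentation.npy")
    np.save(path, labels)       # binary .npy file, keeps dtype and shape
    restored = np.load(path)    # allow_pickle defaults to False

roundtrip_ok = np.array_equal(labels, restored) and restored.dtype == labels.dtype
print("round trip ok:", roundtrip_ok)
```

Pickle files should only be loaded from trusted sources, which is worth keeping in mind when sharing results with collaborators.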

Best Practices

What is the recommended workflow?

Data Loading and Preprocessing:

import spex as sp

Image, channel = sp.load_image('data.tiff')
Image_bg = sp.background_subtract(Image, list(range(len(channel))))
Image_denoised = sp.nlm_denoise(Image_bg, list(range(len(channel))))

Cell Segmentation:

labels = sp.cellpose_cellseg(Image_denoised, [0])
labels_clean = sp.remove_small_objects(labels, min_size=50)
labels_clean = sp.remove_large_objects(labels_clean, max_size=1000)

Feature Extraction:

features = sp.feature_extraction(Image_denoised, labels_clean, list(range(len(channel))))

Clustering:

adata = sp.feature_extraction_adata(Image_denoised, labels_clean, list(range(len(channel))))
clusters = sp.cluster(adata, method='leiden', resolution=0.5)

Spatial Analysis:

clq_matrix = sp.CLQ_vec_numba(labels_clean, clusters, cell_types=[0, 1, 2])

How do I ensure reproducible results?

import numpy as np
import random

# Set random seeds
np.random.seed(42)
random.seed(42)

# Use deterministic algorithms
clusters = sp.cluster(adata, method='leiden', resolution=0.5, random_state=42)
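For your own randomized steps, such as subsampling cells before clustering, a local NumPy generator avoids depending on global seed state. This is a minimal sketch; the helper name `subsample_cells` is hypothetical.

```python
import numpy as np


def subsample_cells(n_cells, n_keep, seed):
    """Pick `n_keep` cell indices reproducibly from `n_cells` candidates."""
    rng = np.random.default_rng(seed)   # local generator, no global state
    return np.sort(rng.choice(n_cells, size=n_keep, replace=False))


first = subsample_cells(1000, 10, seed=42)
second = subsample_cells(1000, 10, seed=42)
print("identical draws:", np.array_equal(first, second))
```

Passing the seed explicitly makes each function reproducible on its own, regardless of what other code has drawn from the global random state beforehand.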

How do I validate my analysis pipeline?

Use known datasets:

- Test with published datasets
- Compare results with known ground truth

Cross-validation:

# Split data and compare results
from sklearn.model_selection import train_test_split
features_train, features_test = train_test_split(features, test_size=0.3)

Parameter sensitivity analysis:

# Test different parameters
resolutions = [0.1, 0.3, 0.5, 0.7, 1.0]
for res in resolutions:
    clusters = sp.cluster(adata, method='leiden', resolution=res)
    # Evaluate clustering quality

Getting Help

Where can I find more documentation?

- Official Documentation: SPEX Documentation
- API Reference: Complete function documentation
- Tutorials: Step-by-step guides
- Examples: Practical code examples

How do I report bugs or request features?

- GitHub Issues: Create an issue on the SPEX GitHub repository
- Email: Contact the development team
- Documentation: Check existing issues and documentation

How do I contribute to SPEX?

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request

Where can I find example datasets?

- SPEX Examples: Included with the library
- Public Repositories:
    - 10X Genomics datasets
    - Human Cell Atlas
    - Allen Brain Atlas
- Synthetic Data: Generate using SPEX simulation functions

Need more help? Check the troubleshooting section or contact the SPEX development team.