❓ Frequently Asked Questions (FAQ)

General Questions

What is SPEX?

SPEX is a comprehensive Python library for spatial transcriptomics analysis. It provides tools for image processing, cell segmentation, feature extraction, clustering, and spatial analysis of multi-channel fluorescence microscopy data.

What types of data does SPEX support?

SPEX supports:

  • Multi-channel TIFF images
  • Fluorescence microscopy data
  • Spatial transcriptomics datasets
  • Single-cell imaging data
  • Any multi-dimensional image data with spatial information

Is SPEX free to use?

Yes, SPEX is open-source and free to use under the Apache License 2.0.

What are the system requirements?

Minimum Requirements:

  • Python 3.8+
  • 8 GB RAM
  • 4 CPU cores

Recommended Requirements:

  • Python 3.9+
  • 16 GB+ RAM
  • 8+ CPU cores
  • GPU support (optional, for acceleration)
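
A quick way to check your environment against these requirements (a minimal sketch; psutil is a third-party package installed separately):

import os
import platform
import psutil  # pip install psutil

print(f"Python:    {platform.python_version()}")                       # 3.9+ recommended
print(f"CPU cores: {os.cpu_count()}")                                  # 8+ recommended
print(f"RAM:       {psutil.virtual_memory().total / 1024**3:.1f} GB")  # 16 GB+ recommended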

Installation

How do I install SPEX?

pip install spex-tools

I'm getting installation errors. What should I do?

Common Solutions:

  1. Update pip and setuptools:

    pip install --upgrade pip setuptools wheel

  2. Install in a virtual environment:

    python -m venv spex_env
    source spex_env/bin/activate  # On Windows: spex_env\Scripts\activate
    pip install spex-tools

  3. Install system dependencies (Ubuntu/Debian):

    sudo apt-get update
    sudo apt-get install libgl1-mesa-glx libglib2.0-0

  4. For M1/M2 Macs:

    conda install -c conda-forge spex-tools

How do I install optional dependencies?

# For GPU support
pip install spex-tools[gpu]

# For development
pip install spex-tools[dev]

# For all optional dependencies
pip install spex-tools[all]

Data Loading

What image formats are supported?

SPEX supports:

  • TIFF/TIF (recommended)
  • PNG
  • JPEG/JPG
  • BMP
  • Most formats supported by PIL/Pillow
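
Since TIFF is the recommended format, other formats can be converted with Pillow before loading. A minimal sketch (the filenames are placeholders):

from PIL import Image as PILImage  # aliased to avoid clashing with the Image array used below

img = PILImage.open('cells.png')
img.save('cells.tiff')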

How do I load my image data?

import spex as sp

# Load single image
Image, channel = sp.load_image('path/to/image.tiff')

# Load multiple images
images = []
for file in image_files:
    img, ch = sp.load_image(file)
    images.append(img)

My image has multiple channels. How do I handle them?

# Load multi-channel image
Image, channel = sp.load_image('multichannel.tiff')

print(f"Image shape: {Image.shape}")  # (height, width, channels)
print(f"Channels: {channel}")

# Access specific channel
channel_0 = Image[:, :, 0]

I'm getting memory errors when loading large images. What can I do?

  1. Use chunked loading:

    # Load in chunks
    Image, channel = sp.load_image('large_image.tiff', chunk_size=1000)

  2. Reduce image resolution:

    from PIL import Image as PILImage  # aliased to avoid clashing with the Image array

    img = PILImage.open('large_image.tiff')
    img_resized = img.resize((img.width // 2, img.height // 2))
    img_resized.save('resized_image.tiff')

  3. Use memory-efficient data types:

    import numpy as np
    Image = Image.astype(np.float32)  # Instead of float64

Image Preprocessing

What preprocessing steps should I apply?

Recommended pipeline:

# 1. Background subtraction
Image_bg = sp.background_subtract(Image, list(range(len(channel))))

# 2. Denoising
Image_denoised = sp.nlm_denoise(Image_bg, list(range(len(channel))))

# 3. Optional: Additional filtering
Image_filtered = sp.median_denoise(Image_denoised, list(range(len(channel))))

How do I choose the right preprocessing parameters?

Background Subtraction:

  • Use the default parameters for most cases
  • Increase window_size for larger background variations
  • Decrease window_size to preserve fine detail

Denoising:

  • nlm_denoise: better at preserving edges, but slower
  • median_denoise: faster, effective against salt-and-pepper noise
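
A small sketch for comparing settings, using the same functions and the window_size parameter shown elsewhere in this FAQ:

# Sweep window_size and inspect its effect on the background estimate
for ws in [25, 50, 100]:
    Image_bg = sp.background_subtract(Image, [0], window_size=ws)
    print(f"window_size={ws}: mean intensity {Image_bg.mean():.3f}")

# Compare the two denoisers on the same input
Image_nlm = sp.nlm_denoise(Image_bg, [0])     # edge-preserving, slower
Image_med = sp.median_denoise(Image_bg, [0])  # fast, good for salt-and-pepper noise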

My images are too noisy. What should I do?

  1. Increase denoising strength:

    Image_denoised = sp.nlm_denoise(Image, list(range(len(channel))), h=0.1)

  2. Apply multiple denoising steps:

    Image_denoised1 = sp.nlm_denoise(Image, list(range(len(channel))))
    Image_denoised2 = sp.median_denoise(Image_denoised1, list(range(len(channel))))

  3. Check image acquisition settings:

     • Increase exposure time
     • Use higher-quality cameras
     • Optimize illumination

Cell Segmentation

Which segmentation method should I use?

Cellpose (Recommended):

  • Best for most cell types
  • Automatic parameter estimation
  • Handles irregular shapes well

StarDist:

  • Designed for star-convex shapes (e.g., nuclei)
  • Faster than Cellpose
  • Requires more parameter tuning

Watershed:

  • Good for round, regular cells
  • Fastest method
  • Requires good preprocessing
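
As a general illustration of how watershed segmentation works (a minimal sketch in plain scikit-image, not SPEX's implementation), markers are seeded at distance-transform peaks and flooded outward:

import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def watershed_labels(binary_mask, min_distance=10):
    # Distance transform peaks at cell centers become the markers
    distance = ndi.distance_transform_edt(binary_mask)
    coords = peak_local_max(distance, min_distance=min_distance, labels=binary_mask)
    markers = np.zeros(distance.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    # Flood the inverted distance map from the markers
    return watershed(-distance, markers, mask=binary_mask)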

How do I set the cell diameter parameter?

# Automatic estimation (recommended)
labels = sp.cellpose_cellseg(Image, [0])

# Manual setting
labels = sp.cellpose_cellseg(Image, [0], diameter=30)

# Estimate from image
from skimage import measure
import numpy as np
props = measure.regionprops(labels)
diameters = [prop.equivalent_diameter for prop in props]
print(f"Average cell diameter: {np.mean(diameters):.1f} pixels")

My segmentation is missing cells. What can I do?

  1. Adjust diameter parameter:

    # Try smaller diameter
    labels = sp.cellpose_cellseg(Image, [0], diameter=20)

  2. Use cell rescue:

    labels_rescued = sp.rescue_cells(Image, labels, [0])

  3. Improve preprocessing:

    # Better contrast enhancement
    Image_enhanced = sp.background_subtract(Image, [0], window_size=50)

How do I remove false positive cells?

# Remove small objects
labels_clean = sp.remove_small_objects(labels, min_size=50)

# Remove large objects
labels_clean = sp.remove_large_objects(labels_clean, max_size=1000)

# Combine both
labels_clean = sp.remove_small_objects(labels, min_size=50)
labels_clean = sp.remove_large_objects(labels_clean, max_size=1000)

Feature Extraction

What features are extracted?

SPEX extracts:

  • Morphological features: area, perimeter, eccentricity, etc.
  • Intensity features: mean, standard deviation, min, max, etc.
  • Texture features: Haralick (GLCM-based) features
  • Spatial features: centroid, bounding box, etc.
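
Because sp.feature_extraction returns a pandas DataFrame (as used throughout this FAQ), you can list the extracted feature names directly:

# List the feature columns that were actually extracted
features = sp.feature_extraction(Image, labels, [0])
print(features.columns.tolist())

# Summary statistics for a couple of them (column names as used elsewhere in this FAQ)
print(features[['area', 'mean_intensity']].describe())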

How do I extract features for specific channels?

# Extract features for all channels
features = sp.feature_extraction(Image, labels, list(range(len(channel))))

# Extract features for specific channels
features = sp.feature_extraction(Image, labels, [0, 2])  # Channels 0 and 2

# Extract features for single channel
features = sp.feature_extraction(Image, labels, [0])

How do I handle missing values in features?

# Remove cells with missing values
features_clean = features.dropna()

# Fill missing values
features_filled = features.fillna(features.mean())

# Interpolate missing values
features_interpolated = features.interpolate(method='linear')

Can I extract custom features?

# Extract basic features first
features = sp.feature_extraction(Image, labels, [0])

# Add custom features
features['custom_ratio'] = features['area'] / features['perimeter']
features['intensity_density'] = features['mean_intensity'] / features['area']

Clustering

Which clustering method should I use?

Leiden (Recommended):

  • Best for most datasets
  • Handles large datasets well
  • Good cluster quality

Louvain:

  • Similar to Leiden
  • Slightly faster
  • May produce fewer clusters

Phenograph:

  • Good for complex datasets
  • Automatic parameter estimation
  • May be slower
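
A quick way to compare methods on the same data. The 'leiden' method string and sp.phenograph_cluster appear elsewhere in this FAQ; 'louvain' as a method string is an assumption, so check sp.cluster's documentation:

import numpy as np

# adata / features as produced by sp.feature_extraction_adata / sp.feature_extraction
for method in ['leiden', 'louvain']:  # 'louvain' string is assumed, not confirmed
    clusters = sp.cluster(adata, method=method, resolution=0.5)
    print(f"{method}: {len(np.unique(clusters))} clusters")

# Phenograph takes the feature matrix directly
pheno_clusters = sp.phenograph_cluster(features, k=30)
print(f"phenograph: {len(np.unique(pheno_clusters))} clusters")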

How do I choose the resolution parameter?

# Try multiple resolutions
resolutions = [0.1, 0.3, 0.5, 0.7, 1.0]
results = {}

for res in resolutions:
    clusters = sp.cluster(adata, method='leiden', resolution=res)
    n_clusters = len(np.unique(clusters))
    results[res] = n_clusters
    print(f"Resolution {res}: {n_clusters} clusters")

# Choose based on expected number of cell types

How do I validate clustering results?

from sklearn.metrics import silhouette_score

# Calculate silhouette score
silhouette_avg = silhouette_score(features, clusters)
print(f"Silhouette score: {silhouette_avg:.3f}")

# Visualize clusters
import matplotlib.pyplot as plt
plt.scatter(features['umap_1'], features['umap_2'], c=clusters, cmap='tab10')
plt.colorbar()
plt.show()

My clusters don't make biological sense. What should I do?

  1. Check feature quality:

    # Remove low-quality features
    feature_variance = features.var()
    good_features = feature_variance[feature_variance > 0.01].index
    features_filtered = features[good_features]

  2. Try different preprocessing:

    # Normalize features
    from sklearn.preprocessing import StandardScaler
    scaler = StandardScaler()
    features_normalized = scaler.fit_transform(features)

  3. Use different clustering method:

    clusters = sp.phenograph_cluster(features, k=30)

Spatial Analysis

What spatial analysis methods are available?

SPEX provides:

  • CLQ (Co-Localization Quotient): measures spatial co-occurrence of cell types
  • Niche Analysis: identifies spatial niches
  • Spatial Autocorrelation: Moran's I, Geary's C
  • Differential Expression: spatially aware DE analysis

How do I calculate spatial relationships between cell types?

# Calculate CLQ between cell types
clq_matrix = sp.CLQ_vec_numba(labels, cluster_labels, 
                             cell_types=[0, 1, 2], 
                             radius=50)

# Visualize CLQ matrix
import seaborn as sns
import matplotlib.pyplot as plt
sns.heatmap(clq_matrix, annot=True, cmap='RdBu_r', center=0)
plt.show()

How do I identify spatial niches?

# Perform niche analysis
niche_results = sp.niche(labels, cluster_labels, 
                        radius=100, 
                        min_cells=10)

# Visualize niches
niche_results.plot_niches()

How do I interpret spatial autocorrelation results?

# Calculate Moran's I
moran_i = sp.spatial_autocorrelation(features, labels, method='moran')

# Interpretation:
# Moran's I > 0: Positive spatial autocorrelation (clustering)
# Moran's I < 0: Negative spatial autocorrelation (dispersion)
# Moran's I ≈ 0: Random spatial distribution

Performance and Optimization

My analysis is running slowly. How can I speed it up?

  1. Use parallel processing:

    # Set number of jobs
    clusters = sp.cluster(adata, method='leiden', n_jobs=4)

  2. Reduce image size:

    # Resize image before processing
    from skimage.transform import resize
    Image_small = resize(Image, (Image.shape[0]//2, Image.shape[1]//2))

  3. Use chunked processing:

    # Process large images in chunks
    labels = sp.cellpose_cellseg(Image, [0], chunk_size=512)

How do I handle large datasets that don't fit in memory?

  1. Use Dask for out-of-memory processing:

    import dask.array as da
    Image_dask = da.from_array(Image, chunks=(1000, 1000, -1))

  2. Process in batches:

    import gc

    # Process multiple files in batches
    for batch in file_batches:
        process_batch(batch)
        gc.collect()  # Free memory between batches

  3. Use memory-efficient data types:

    Image = Image.astype(np.float32)  # Instead of float64

How do I monitor memory usage?

import psutil
import os

def get_memory_usage():
    process = psutil.Process(os.getpid())
    return process.memory_info().rss / 1024 / 1024  # MB

print(f"Memory usage: {get_memory_usage():.1f} MB")

Troubleshooting

I'm getting "CUDA out of memory" errors

Solutions:

  1. Reduce batch size:

    labels = sp.cellpose_cellseg(Image, [0], batch_size=1)

  2. Use CPU instead of GPU:

    labels = sp.cellpose_cellseg(Image, [0], use_gpu=False)

  3. Process smaller image chunks:

    labels = sp.cellpose_cellseg(Image, [0], chunk_size=256)

My segmentation is producing too many/few cells

Too many cells:

# Increase minimum cell size
labels = sp.remove_small_objects(labels, min_size=100)

# Increase diameter parameter
labels = sp.cellpose_cellseg(Image, [0], diameter=40)

Too few cells:

# Decrease diameter parameter
labels = sp.cellpose_cellseg(Image, [0], diameter=20)

# Use cell rescue
labels = sp.rescue_cells(Image, labels, [0])

I'm getting "No module named 'spex'" error

Solutions:

  1. Check installation:

    pip list | grep spex

  2. Reinstall SPEX:

    pip uninstall spex-tools
    pip install spex-tools

  3. Check Python environment:

    which python
    pip --version

My clustering is producing only one cluster

Solutions:

  1. Check feature variance:

    feature_variance = features.var()
    print(f"Features with zero variance: {(feature_variance == 0).sum()}")

  2. Increase the resolution:

    clusters = sp.cluster(adata, method='leiden', resolution=1.0)

  3. Try a different clustering method:

    clusters = sp.phenograph_cluster(features, k=30)

How do I save and load my analysis results?

Save results:

import pickle

# Save segmentation
with open('segmentation.pkl', 'wb') as f:
    pickle.dump(labels, f)

# Save features
features.to_csv('features.csv')

# Save clustering
with open('clustering.pkl', 'wb') as f:
    pickle.dump(clusters, f)

Load results:

import pickle
import pandas as pd

# Load segmentation
with open('segmentation.pkl', 'rb') as f:
    labels = pickle.load(f)

# Load features
features = pd.read_csv('features.csv', index_col=0)

# Load clustering
with open('clustering.pkl', 'rb') as f:
    clusters = pickle.load(f)

Best Practices

What does a typical end-to-end pipeline look like?

  1. Data Loading and Preprocessing:

    Image, channel = sp.load_image('data.tiff')
    Image_bg = sp.background_subtract(Image, list(range(len(channel))))
    Image_denoised = sp.nlm_denoise(Image_bg, list(range(len(channel))))

  2. Cell Segmentation:

    labels = sp.cellpose_cellseg(Image_denoised, [0])
    labels_clean = sp.remove_small_objects(labels, min_size=50)
    labels_clean = sp.remove_large_objects(labels_clean, max_size=1000)

  3. Feature Extraction:

    features = sp.feature_extraction(Image_denoised, labels_clean, list(range(len(channel))))

  4. Clustering:

    adata = sp.feature_extraction_adata(Image_denoised, labels_clean, list(range(len(channel))))
    clusters = sp.cluster(adata, method='leiden', resolution=0.5)

  5. Spatial Analysis:

    clq_matrix = sp.CLQ_vec_numba(labels_clean, clusters, cell_types=[0, 1, 2])

How do I ensure reproducible results?

import numpy as np
import random

# Set random seeds
np.random.seed(42)
random.seed(42)

# Use deterministic algorithms
clusters = sp.cluster(adata, method='leiden', resolution=0.5, random_state=42)

How do I validate my analysis pipeline?

  1. Use known datasets:

     • Test with published datasets
     • Compare results with known ground truth

  2. Cross-validation:

    # Split data and compare results
    from sklearn.model_selection import train_test_split
    features_train, features_test = train_test_split(features, test_size=0.3)

  3. Parameter sensitivity analysis:

    # Test different parameters
    resolutions = [0.1, 0.3, 0.5, 0.7, 1.0]
    for res in resolutions:
        clusters = sp.cluster(adata, method='leiden', resolution=res)
        # Evaluate clustering quality

Getting Help

Where can I find more documentation?

  • Official Documentation: SPEX Documentation
  • API Reference: Complete function documentation
  • Tutorials: Step-by-step guides
  • Examples: Practical code examples

How do I report bugs or request features?

  1. GitHub Issues: Create an issue on the SPEX GitHub repository
  2. Email: Contact the development team
  3. Documentation: Check existing issues and documentation

How do I contribute to SPEX?

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

Where can I find example datasets?

  • SPEX Examples: Included with the library
  • Public Repositories:
      • 10X Genomics datasets
      • Human Cell Atlas
      • Allen Brain Atlas
  • Synthetic Data: Generate using SPEX simulation functions

Need more help? Check the troubleshooting section or contact the SPEX development team.