# GeoSAM Image Encoder Package

```{admonition} Superseded by the geosam library
:class: warning

The standalone `GeoSAM-Image-Encoder` Python package has been **superseded** by the [`geosam`](https://github.com/Fanchengyan/geosam) core library since Geo-SAM v2.0. **Use the `geosam` library directly** (see the migration section below). The legacy package is no longer supported by the Geo-SAM QGIS plugin v2.0+; it is only compatible with plugins older than v2.0.
```

---

## Migration to `geosam` (recommended)

Install the `geosam` library with all dependencies (geospatial stack + SAM models):

```bash
pip install "geosam[all]"
```

Then use it directly in your own scripts. The recommended starting point is the [`FeatureCacheBuilder`](https://github.com/Fanchengyan/geosam) helper, which handles tiling, encoding, and manifest writing for you:

```python
from geosam import RasterDataset, FeatureCacheBuilder
from geosam.models import ModelSpec

dataset = RasterDataset("image.tif", crs="EPSG:32650")

model_spec = ModelSpec(
    model_type="sam2",  # "sam" | "sam2" | "sam3"
    checkpoint_path="/path/to/model.pth",
)

builder = FeatureCacheBuilder(
    dataset,
    model_spec,
    output_dir="features",
    chip_size=1024,  # default: 1024 SAM/SAM2, 1008 SAM3
    stride=512,  # 50% overlap (recommended)
)
manifest_path = builder.build()
```

`builder.build()` slices the raster into overlapping chips, runs the SAM image encoder on each chip, saves the features under `features/features/`, and writes `features/manifest.parquet`. The resulting folder can be loaded directly by the Geo-SAM plugin (see the [Pre-encoded segmentation](pre_encoded_segmentation.md) page) or by `FeatureQueryEngine` in your own scripts.

### What each part does — and how to customize it

**`RasterDataset` — open the raster**

`RasterDataset` accepts any path readable by rasterio/GDAL: GeoTIFF, COG, JP2, or virtual filesystem paths like `/vsicurl/https://...`. If the path is wrong or the file cannot be opened, it raises `FileNotFoundError`.
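Since a wrong path only surfaces as `FileNotFoundError` once the dataset is opened, a batch script may prefer to validate its inputs before loading any model. The following is a minimal sketch using only the standard library. `check_raster_path` is a hypothetical helper, not part of `geosam`, and it assumes that GDAL virtual filesystem paths can be recognized by their `/vsi` prefix and should be passed through unchecked.

```python
from pathlib import Path


def check_raster_path(path: str) -> str:
    """Hypothetical pre-flight check; not part of the geosam API."""
    # Assumption: GDAL virtual filesystem paths (/vsicurl/, /vsis3/, ...)
    # are not local files, so they are returned without any check.
    if path.startswith("/vsi"):
        return path
    # Local paths must point at an existing regular file.
    if not Path(path).is_file():
        raise FileNotFoundError(f"Raster not found: {path}")
    return path
```

Running this over every input path first lets a long encoding job fail early with a clear message. It does not prove that the file is a readable raster; that is still decided when `RasterDataset` opens it.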
- **`crs`** is optional. Set it only when you want the raster reprojected on the fly (for example, reproject a lat/lon image to UTM so you can work in meters). Leave it out to keep the source CRS.
- **`indexes`** selects which bands SAM will see. Band numbers start at 1. If you don't set it, every band in the file is used. Common cases:

  ```python
  RasterDataset("image.tif")                     # all bands
  RasterDataset("image.tif", indexes=[1])        # single-band (SAR, DEM, pan)
  RasterDataset("image.tif", indexes=[1, 2, 3])  # RGB
  RasterDataset("image.tif", indexes=[4, 1, 2])  # custom order, e.g. NIR-R-G
  ```

  Whatever you pick, the encoder reshapes it to exactly 3 channels for SAM: 1-band is copied to 3, more than 3 bands take the first 3. So for a multispectral image, set `indexes` explicitly to feed SAM the bands you want it to see.

**`ModelSpec` — choose the model**

The `model_type` → checkpoint mapping:

| `model_type` | Checkpoints |
| --- | --- |
| `"sam"` | SAM v1 (`sam_vit_b_01ec64.pth`, `sam_vit_l_0b3195.pth`) |
| `"sam2"` | SAM2 / SAM2.1 (`sam2_hiera_*.pt`, `sam2.1_hiera_*.pt`) |
| `"sam3"` | SAM3 (`sam3.pt`) |

If you don't want to remember the mapping, let geosam infer it from the checkpoint filename:

```python
from geosam.runtime import create_model_spec_from_checkpoint

model_spec = create_model_spec_from_checkpoint("/path/to/model.pth")
```

**`FeatureCacheBuilder` — tile and encode**

- **`chip_size`** — omit it to use the model's native image size (the default). You'd only set it explicitly to force a different window; for example, encoding very large rasters on a GPU with limited memory.
- **`stride`** is the step between windows in pixels:
  - `stride = chip_size` → no overlap. Smaller cache, but objects sitting on a tile border may be missed.
  - `stride = chip_size // 2` → 50% overlap. This is what the Geo-SAM plugin uses by default and is recommended.
  - Smaller stride = more chips = more disk space and longer encode time, but better border coverage.
- **`output_dir`** is where the feature folder is written. After `builder.build()` finishes it contains:

  ```
  features/
  ├── features/
  │   ├── chip_000000.pt
  │   ├── chip_000001.pt
  │   └── ...
  └── manifest.parquet
  ```

  The `manifest.parquet` is required — without it the plugin cannot open the feature folder. `FeatureCacheBuilder` writes it for you; if you ever build a feature folder by hand, remember to write one too.

See the [`geosam` documentation](https://github.com/Fanchengyan/geosam) for the full API reference, including the [`OnlineQueryEngine`](https://github.com/Fanchengyan/geosam) for single-chip interactive queries and the [`FeatureCacheBuilder`](https://github.com/Fanchengyan/geosam) for batched pre-encoding.

---

The rest of this page documents the **legacy** `GeoSAM-Image-Encoder` package for reference. These workflows are only compatible with Geo-SAM plugin versions older than v2.0.

## Legacy Overview

The `GeoSAM-Image-Encoder` package was a standalone Python package that did not depend on QGIS. It allowed you to encode remote sensing images into features on a remote server (e.g., Colab or AWS) and then load them in the Geo-SAM QGIS plugin.

## Legacy Installation

```{admonition} Install PyTorch first
:class: note

Installing `GeoSAM-Image-Encoder` directly installs the CPU version of PyTorch. Install the appropriate PyTorch version first, following the official PyTorch installation instructions.
```

```bash
pip install GeoSAM-Image-Encoder
# or from source
pip install git+https://github.com/Fanchengyan/GeoSAM-Image-Encoder.git
```

## Legacy Python Usage

```python
import geosam
from geosam import ImageEncoder

# check GPU availability
geosam.gpu_available()

# encode by direct parameters
checkpoint_path = "/content/sam_vit_l_0b3195.pth"
image_path = "/content/beiluhe_google_img_201211_clip.tif"
feature_dir = "./"

img_encoder = ImageEncoder(checkpoint_path)
img_encoder.encode_image(image_path, feature_dir)
```

### Using a Settings JSON File

```python
import geosam
from geosam import ImageEncoder

setting_file = "/content/setting.json"
feature_dir = "./"

settings = geosam.parse_settings_file(setting_file)
settings.update({"feature_dir": feature_dir})
init_settings, encode_settings = geosam.split_settings(settings)

img_encoder = ImageEncoder(**init_settings)
img_encoder.encode_image(**encode_settings)
```

## Legacy Terminal Usage

```bash
image_encoder.py -i /content/image.tif -c /content/checkpoint.pth -f ./

# override settings from a file
image_encoder.py -s /content/setting.json -f ./ --stride 256 --value_range "10,255"

# see all options
image_encoder.py -h
```

## Legacy Colab Example

The original Colab notebook is still available: