istat_mcp_server
MCP server to query Italian statistics (ISTAT) via SDMX API — compatible with any MCP client
ISTAT MCP Server
MCP server for accessing Italian statistical data from the ISTAT SDMX API.
Overview
This Model Context Protocol (MCP) server gives Claude Desktop and other MCP-compatible clients access to Italian statistical data from ISTAT (Istituto Nazionale di Statistica) through the SDMX REST API. It uses a two-layer cache to minimize API calls and provides eight tools for discovering, querying, and retrieving statistical data.
Features
8 MCP Tools for data discovery and retrieval:
- discover_dataflows - Find available datasets by keywords (with blacklist filtering)
- get_structure - Get dimension definitions and codelists for a datastructure ID
- get_constraints - Get available constraint values for each dimension with descriptions (combines structure + constraints + codelist descriptions)
- get_codelist_description - Get descriptions in Italian/English for codelist values
- get_concepts - Get semantic definitions of SDMX concepts
- get_data - Fetch actual statistical data in SDMX XML format (with blacklist validation)
- get_cache_diagnostics - Debug tool to inspect cache status
- get_territorial_codes - Resolve ISTAT REF_AREA codes for Italia, ripartizioni, regioni, province, and comuni
Recommended Workflow (simple and efficient):
- Discover: Use discover_dataflows to find the dataflow you're interested in
- Get Complete Metadata: Use get_constraints to see all dimensions with valid values AND descriptions in one call
  - This is the RECOMMENDED approach - one call instead of many
  - Internally combines get_structure + get_codelist_description for all dimensions
  - All data cached for 1 month → subsequent calls are instant
  - Returns complete information ready for building filters in get_data
- Fetch Data: Use get_data with the appropriate dimension filters to retrieve actual data
Alternative workflow (manual approach):
- Use get_structure with a datastructure ID to see dimensions and their codelists
- Then call get_codelist_description manually for each codelist you need
- Use get_concepts if you need semantic definitions of dimensions/attributes
Two-Layer Cache:
- In-memory cache (cachetools) for fast access during session
- Persistent disk cache (diskcache) that survives restarts
Rate Limiting: Maximum 3 API calls per minute with automatic queuing
Retry Logic: Exponential backoff on transient errors
Dataflow Blacklist: Filter out specific dataflows from all queries
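The retry behaviour mentioned above can be illustrated with a minimal sketch. It is not taken from this repository: the `TransientError` class, the function name, and the default delays are assumptions chosen for the example, and the server's own retry code may differ.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. timeouts, HTTP 5xx)."""


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on TransientError with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Roughly 1s, 2s, 4s, ... plus a little jitter so that
            # several clients do not retry in lockstep.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the function can be exercised in tests without actually waiting.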
Installation
- Clone the repository:
git clone https://github.com/ondata/istat_mcp_server.git
cd istat_mcp_server
- Create a virtual environment and install dependencies (Python >=3.11 required):
With uv (recommended):
uv sync
uv sync automatically creates a .venv directory and installs all dependencies into it. To run commands manually, activate it first:
# Linux/macOS
source .venv/bin/activate
# Windows
.venv\Scripts\activate
With pip:
python -m venv .venv
# Linux/macOS
source .venv/bin/activate
# Windows
.venv\Scripts\activate
pip install -e .
- Create a .env file (optional, uses defaults if not present):
cp .env.example .env
Optional: for slow availableconstraint responses used by get_constraints, set:
AVAILABLECONSTRAINT_TIMEOUT_SECONDS=180
MCP Client Configuration
This server works with any MCP-compatible client. The sections below cover the most common ones.
Claude Desktop | Claude Code | Gemini CLI | VS Code | Codex CLI | Claude Desktop on Windows with Python on WSL2
In all examples, replace /path/to/istat_mcp_server with the actual path to this directory, and python with python3 if needed on your system.
Claude Desktop
Add to your Claude Desktop configuration file:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
{
"mcpServers": {
"istat": {
"command": "python",
"args": ["-m", "istat_mcp_server"],
"cwd": "/path/to/istat_mcp_server"
}
}
}
Note: if python is not found in your system PATH, replace "python" in "command" with the absolute path to your Python executable (e.g. /usr/bin/python3 or C:\Python311\python.exe).
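If you prefer not to look up the interpreter path by hand, a small helper script can print the same configuration with the absolute path of the Python that runs it. This is only a convenience sketch; the function name `build_claude_config` is hypothetical and not part of this project. Run it with the interpreter from the virtual environment where the server is installed.

```python
import json
import os
import sys


def build_claude_config(project_dir: str, python_executable: str = sys.executable) -> dict:
    """Return an mcpServers entry that uses an absolute interpreter path."""
    return {
        "mcpServers": {
            "istat": {
                "command": python_executable,
                "args": ["-m", "istat_mcp_server"],
                "cwd": os.path.expanduser(project_dir),
            }
        }
    }


if __name__ == "__main__":
    # Replace the placeholder with the real location of the repository.
    print(json.dumps(build_claude_config("/path/to/istat_mcp_server"), indent=2))
```

Copy the printed JSON into the configuration file listed above, merging it with any existing mcpServers entries.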
Claude Code
Add globally (available in all your projects):
claude mcp add -s user istat -- python -m istat_mcp_server --cwd /path/to/istat_mcp_server
Add for the current project only (creates or updates .mcp.json in the project folder):
claude mcp add istat -- python -m istat_mcp_server --cwd /path/to/istat_mcp_server
Or add manually to .mcp.json in your project root:
{
"mcpServers": {
"istat": {
"command": "python",
"args": ["-m", "istat_mcp_server"],
"cwd": "/path/to/istat_mcp_server"
}
}
}
-s user makes the server available globally across all your projects. Without it, the server is scoped to the current project only.
Gemini CLI
Add globally:
gemini mcp add -s user istat -- python -m istat_mcp_server --cwd /path/to/istat_mcp_server
Or add manually to ~/.gemini/settings.json:
{
"mcpServers": {
"istat": {
"command": "python",
"args": ["-m", "istat_mcp_server"],
"cwd": "/path/to/istat_mcp_server"
}
}
}
VS Code
Add to your User Settings or .vscode/settings.json:
{
"mcpServers": {
"istat": {
"command": "python",
"args": ["-m", "istat_mcp_server"],
"cwd": "/path/to/istat_mcp_server"
}
}
}
Codex CLI
Add to ~/.codex/config.toml:
[mcp_servers.istat]
command = "python"
args = ["-m", "istat_mcp_server"]
cwd = "/path/to/istat_mcp_server"
Claude Desktop on Windows with Python on WSL2
If you run Claude Desktop on Windows but have Python and this server installed inside WSL2, use wsl.exe -e to bridge the two environments. Point to the Python executable inside your virtual environment:
{
"mcpServers": {
"istat": {
"command": "wsl.exe",
"args": [
"-e",
"/home/<your-user>/path/to/istat_mcp_server/.venv/bin/python",
"-m", "istat_mcp_server"
]
}
}
}
Replace /home/<your-user>/path/to/istat_mcp_server with the actual WSL path to this directory.
Note: Claude Code runs natively inside WSL2 and uses the standard configuration above. The wsl.exe wrapper is only needed for Claude Desktop running on the Windows side.
Skill (Recommended)
This project includes an Agent Skill in skills/istat-mcp/ that guides the model through the correct workflow step by step. It is strongly recommended to install the skill for a better experience: it reduces errors, avoids unnecessary API calls, and produces more accurate results.
Claude Code CLI
claude skills add ./skills/istat-mcp
Claude Desktop
- Open Claude Desktop
- Click the Settings icon (gear icon, bottom-left)
- Select Skills in the left sidebar
- Click "Add Skill"
- Browse to the skills/istat-mcp folder inside this repository and select it
- The skill will appear in the list as istat-mcp; make sure it is enabled
Dataflow Blacklist Configuration
You can exclude specific dataflows from all queries using environment variables. This is useful for filtering out problematic or unwanted datasets.
Configuration via .env file
Add the DATAFLOW_BLACKLIST variable to your .env file:
# Exclude specific dataflows (comma-separated list)
DATAFLOW_BLACKLIST=149_577_DF_DCSC_OROS_1_1,22_315_DF_DCIS_POPORESBIL1_2
Behavior
- discover_dataflows: Blacklisted dataflows are automatically filtered out from results
- get_data: Attempts to fetch data from blacklisted dataflows will return an error message
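The behaviour described above can be sketched as follows. The function names and the `id` key on each dataflow record are assumptions made for illustration; the actual logic lives in the project's own blacklist utilities and may be organized differently.

```python
from __future__ import annotations

import os


def load_blacklist(raw: str | None = None) -> set[str]:
    """Parse a comma-separated DATAFLOW_BLACKLIST value into a set of IDs."""
    if raw is None:
        raw = os.environ.get("DATAFLOW_BLACKLIST", "")
    # Ignore surrounding whitespace and empty entries (e.g. a trailing comma).
    return {item.strip() for item in raw.split(",") if item.strip()}


def filter_dataflows(dataflows: list[dict], blacklist: set[str]) -> list[dict]:
    """Drop blacklisted entries, as described for discover_dataflows."""
    return [df for df in dataflows if df["id"] not in blacklist]


def check_allowed(dataflow_id: str, blacklist: set[str]) -> str | None:
    """Return an error message for blacklisted IDs, as described for get_data."""
    if dataflow_id in blacklist:
        return f"Dataflow '{dataflow_id}' is blacklisted and cannot be queried."
    return None
```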
Use Cases
- Exclude deprecated dataflows
- Filter out problematic datasets that cause errors
- Hide internal or test dataflows from users
Usage Examples
Once configured, you can ask Claude questions like:
Step 1: Discover dataflows
- "Show me all available dataflows about population"
- "Find dataflows related to agriculture"
Step 2: Get complete constraint information (RECOMMENDED)
- "Get constraints for dataflow 101_1015_DF_DCSP_COLTIVAZIONI_1"
- Returns all dimensions with valid values AND Italian/English descriptions
- One call instead of multiple get_structure + get_codelist_description calls
- Everything cached for 1 month
Step 2 Alternative: Explore structure and codelists manually
- "Show me the structure of datastructure DCSP_COLTIVAZIONI"
- "Get descriptions for codelist CL_ITTER107 to find Italian regions"
- "Show me all values in codelist CL_AGRI_MADRE for crop types"
Step 3: Fetch data with filters
- "Fetch population data for Italy from 2020 to 2023"
- "Get agricultural data for dataflow 101_1015_DF_DCSP_COLTIVAZIONI_1 filtered by REF_AREA=IT and TYPE_OF_CROP=APPLE"
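For context on what dimension filters such as REF_AREA=IT turn into: in the SDMX REST convention, a data query key is a dot-separated string with one position per dimension, in the order defined by the data structure. An empty position means "no filter", and several values for one dimension are joined with `+`. The sketch below builds such a key. The function name and the dimension order used in the example are hypothetical (the real order comes from get_structure or get_constraints), and how get_data builds its request internally is not shown in this README.

```python
from __future__ import annotations


def build_sdmx_key(dimension_order: list[str], filters: dict[str, str | list[str]]) -> str:
    """Build a dot-separated SDMX key from per-dimension filters."""
    unknown = set(filters) - set(dimension_order)
    if unknown:
        raise ValueError(f"Unknown dimensions: {sorted(unknown)}")
    parts = []
    for dim in dimension_order:
        value = filters.get(dim, "")  # empty position = no filter on that dimension
        if isinstance(value, (list, tuple)):
            value = "+".join(value)  # several codes for the same dimension
        parts.append(value)
    return ".".join(parts)


# Hypothetical dimension order, for illustration only.
order = ["FREQ", "REF_AREA", "DATA_TYPE", "TYPE_OF_CROP"]
key = build_sdmx_key(order, {"REF_AREA": "IT", "TYPE_OF_CROP": "APPLE"})
print(key)  # .IT..APPLE
```

The resulting key takes the place of ALL in a data URL of the form /data/{dataflow_id}/ALL/.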
Development
Run tests:
pytest
Format code:
ruff format .
Check code:
ruff check .
Project Structure
.
├── src/
│   └── istat_mcp_server/
│       ├── __init__.py
│       ├── __main__.py          # Entry point for `python -m istat_mcp_server`
│       ├── server.py            # MCP server initialization
│       ├── api/                 # API client and models
│       │   ├── client.py        # HTTP client with rate limiting
│       │   └── models.py        # Pydantic models
│       ├── cache/               # Two-layer cache system
│       │   ├── manager.py       # Cache façade
│       │   ├── memory.py        # In-memory cache
│       │   └── persistent.py    # Disk cache
│       ├── tools/               # MCP tool handlers
│       │   ├── discover_dataflows.py
│       │   ├── get_structure.py
│       │   ├── get_constraints.py
│       │   ├── get_codelist_description.py
│       │   ├── get_concepts.py
│       │   ├── get_data.py
│       │   └── get_cache_diagnostics.py
│       └── utils/               # Utilities
│           ├── logging.py
│           ├── validators.py
│           └── blacklist.py
├── tests/                       # Test suite
├── cache/                       # Runtime cache (git-ignored)
├── log/                         # Log files (git-ignored)
├── .env.example
├── pyproject.toml
└── README.md
Cache Configuration
The server uses a two-layer caching strategy:
- Memory Cache: Fast in-process cache with 5-minute TTL
- Persistent Cache: Disk-based cache with configurable TTLs:
  - Dataflows: 7 days
  - Structures/Codelists: 1 month
  - Data: 1 hour
Relevant .env variables:
MEMORY_CACHE_TTL_SECONDS=300
DATAFLOWS_CACHE_TTL_SECONDS=604800
METADATA_CACHE_TTL_SECONDS=2592000
OBSERVED_DATA_CACHE_TTL_SECONDS=3600
AVAILABLECONSTRAINT_TIMEOUT_SECONDS=180
Cache is stored in the ./cache directory by default.
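To make the lookup order concrete, here is a minimal sketch of a two-layer cache. It deliberately uses only the standard library (a dict in front of JSON files) instead of cachetools and diskcache, so it is a simplified stand-in and not the project's implementation. The class name, the file naming scheme, and the injectable `clock` are assumptions for the example.

```python
from __future__ import annotations

import json
import time
from pathlib import Path
from typing import Any, Callable


class TwoLayerCache:
    """Memory dict in front of JSON files on disk, each layer with its own TTL."""

    def __init__(self, directory: str, memory_ttl: float = 300,
                 clock: Callable[[], float] = time.time) -> None:
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)
        self._memory: dict[str, tuple[float, Any]] = {}  # key -> (expires_at, value)
        self._memory_ttl = memory_ttl
        self._clock = clock

    def _path(self, key: str) -> Path:
        # Hypothetical naming scheme; keys are assumed to be filename-safe.
        return self._dir / f"{key}.json"

    def get(self, key: str) -> Any | None:
        now = self._clock()
        entry = self._memory.get(key)
        if entry and entry[0] > now:
            return entry[1]  # layer 1 hit
        path = self._path(key)
        if path.exists():
            record = json.loads(path.read_text(encoding="utf-8"))
            if record["expires_at"] > now:
                # Layer 2 hit: promote to memory for the next lookup.
                expires = min(now + self._memory_ttl, record["expires_at"])
                self._memory[key] = (expires, record["value"])
                return record["value"]
        return None

    def set(self, key: str, value: Any, persistent_ttl: float) -> None:
        now = self._clock()
        disk_expiry = now + persistent_ttl
        self._memory[key] = (min(now + self._memory_ttl, disk_expiry), value)
        record = {"expires_at": disk_expiry, "value": value}
        self._path(key).write_text(json.dumps(record), encoding="utf-8")

    def get_or_fetch(self, key: str, fetch: Callable[[], Any], persistent_ttl: float) -> Any:
        """Return a cached value, calling fetch (e.g. the API) only on a miss."""
        cached = self.get(key)
        if cached is not None:
            return cached
        value = fetch()
        self.set(key, value, persistent_ttl)
        return value
```

With the defaults listed above, a dataflow list would be stored with `persistent_ttl=604800`, metadata with `2592000`, and observed data with `3600`.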
Logging and Debugging
The server automatically creates log files in the ./log directory with the following features:
- Automatic Rotation: Log files are rotated when they reach 10MB
- Retention: Last 5 log files are kept
- Dual Output: Logs are written both to file and stderr (for Claude Desktop logs)
Log Levels
Control the verbosity via the LOG_LEVEL environment variable in .env:
LOG_LEVEL=DEBUG # Maximum detail for debugging
LOG_LEVEL=INFO # Default, standard operations
LOG_LEVEL=WARNING # Only warnings and errors
LOG_LEVEL=ERROR # Only errors
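A logging setup with the properties described above can be sketched with the standard library alone. This is an illustration, not the contents of the project's logging utility: the function name, logger name, and message format are assumptions, and `backupCount=5` is one reading of "last 5 log files are kept".

```python
import logging
import os
import sys
from logging.handlers import RotatingFileHandler


def configure_logging(log_dir: str = "./log") -> logging.Logger:
    """Log to a rotating file and to stderr, honouring LOG_LEVEL."""
    os.makedirs(log_dir, exist_ok=True)
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, logging.INFO)  # fall back to INFO

    logger = logging.getLogger("istat_mcp_server")
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    file_handler = RotatingFileHandler(
        os.path.join(log_dir, "istat_mcp_server.log"),
        maxBytes=10 * 1024 * 1024,  # rotate at 10 MB
        backupCount=5,              # keep five rotated files
        encoding="utf-8",
    )
    stderr_handler = logging.StreamHandler(sys.stderr)
    for handler in (file_handler, stderr_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Writing to stderr rather than stdout matters for servers launched over stdio, because stdout carries the protocol messages.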
Finding Logs
- Server Logs: ./log/istat_mcp_server.log
- Claude Desktop Logs:
  - Windows: %APPDATA%\Claude\logs\
  - macOS: ~/Library/Logs/Claude/
Debug Cache Issues
The log file shows:
- Cache directory path at startup
- Cache operations (with DEBUG level)
- API calls and retries
- Tool invocations
Use the get_cache_diagnostics tool in Claude Desktop to inspect cache status in real-time.
SDMX API Usage Notes
Rate Limiting
The ISTAT SDMX API is rate-limited to 3 calls per minute. The server automatically handles this by queuing requests when the limit is reached.
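One common way to enforce such a limit is a sliding-window limiter that makes callers wait until a slot is free. The sketch below is a synchronous, single-threaded illustration with hypothetical names; the server's own client may use a different (for example asynchronous) mechanism.

```python
import collections
import time
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_calls per period seconds; callers wait their turn."""

    def __init__(
        self,
        max_calls: int = 3,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._sleep = sleep
        self._timestamps = collections.deque()  # start times of recent calls

    def acquire(self) -> float:
        """Block until a call slot is free; return the seconds waited."""
        waited = 0.0
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._timestamps and now - self._timestamps[0] >= self._period:
                self._timestamps.popleft()
            if len(self._timestamps) < self._max_calls:
                self._timestamps.append(now)
                return waited
            # Window is full: wait until the oldest call expires.
            delay = self._period - (now - self._timestamps[0])
            self._sleep(delay)
            waited += delay
```

Calling `limiter.acquire()` before each HTTP request is enough to keep a single process under the limit.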
Accept Headers
The ISTAT SDMX API requires specific Accept headers depending on the endpoint and desired format. Using a generic application/json may return empty responses.
Data (CSV):
curl -H "Accept: application/vnd.sdmx.data+csv;version=1.0.0" \
"https://esploradati.istat.it/SDMXWS/rest/data/{dataflow_id}/ALL/"
Structure/Constraints (JSON):
curl -H "Accept: application/vnd.sdmx.structure+json; version=1.0" \
"https://esploradati.istat.it/SDMXWS/rest/availableconstraint/{dataflow_id}/all/all?mode=available"
Structure/Constraints (XML, default):
curl "https://esploradati.istat.it/SDMXWS/rest/availableconstraint/{dataflow_id}/all/all?mode=available"
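The same requests can be made from Python. The sketch below mirrors the curl examples using only `urllib` from the standard library; the function names are hypothetical, and the 180-second default timeout simply reuses the value suggested for AVAILABLECONSTRAINT_TIMEOUT_SECONDS.

```python
import urllib.request

BASE_URL = "https://esploradati.istat.it/SDMXWS/rest"
CSV_ACCEPT = "application/vnd.sdmx.data+csv;version=1.0.0"
STRUCTURE_JSON_ACCEPT = "application/vnd.sdmx.structure+json; version=1.0"


def build_data_request(dataflow_id: str) -> urllib.request.Request:
    """Request all observations of a dataflow as CSV."""
    url = f"{BASE_URL}/data/{dataflow_id}/ALL/"
    return urllib.request.Request(url, headers={"Accept": CSV_ACCEPT})


def build_constraints_request(dataflow_id: str) -> urllib.request.Request:
    """Request the available constraints of a dataflow as JSON."""
    url = f"{BASE_URL}/availableconstraint/{dataflow_id}/all/all?mode=available"
    return urllib.request.Request(url, headers={"Accept": STRUCTURE_JSON_ACCEPT})


def fetch_text(request: urllib.request.Request, timeout: float = 180.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_text(build_data_request("101_1015_DF_DCSP_COLTIVAZIONI_1"))` would download that dataflow as CSV; keep the rate limit in mind when calling it repeatedly.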
License
MIT License
Contributing
Contributions are welcome! Please open an issue or pull request.
Authors
- Vincenzo Patruno: https://www.linkedin.com/in/vincenzopatruno/
- Andrea Borruso: https://www.linkedin.com/in/andreaborruso
References
- ISTAT SDMX API: https://esploradati.istat.it/SDMXWS/rest/
- Model Context Protocol: https://modelcontextprotocol.io/
- Guide to the ISTAT SDMX API (in Italian): https://ondata.github.io/guida-api-istat/