OpenFDA-Semantic-MCP

mcp
Security Audit
Warning
Health: Warning
  • No license — Repository has no license file
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 6 GitHub stars
Code: Passed
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions: Passed
  • Permissions — No dangerous permissions requested
Purpose
This is a Model Context Protocol (MCP) server that allows AI agents to query the public openFDA API ecosystem. It translates complex API interactions into intent-driven commands, enabling LLMs to securely retrieve and summarize healthcare data like drug labels, adverse events, and device clearances without overwhelming the AI's context window.

Security Assessment
The overall risk is Low. The server makes outbound network requests exclusively to the public openFDA databases. It does not execute local shell commands, requires no dangerous system permissions, and the source code contains no hardcoded secrets or dangerous patterns. Because it interacts with a public external API rather than local files or internal systems, the attack surface for your local environment is minimal.

Quality Assessment
The project appears to be actively maintained, with its most recent code push occurring today. The codebase shows strong architectural practices, utilizing domain-driven design and strict Pydantic models to prevent AI hallucinations and handle API errors gracefully. However, there are significant concerns regarding maturity and legal safety. The repository entirely lacks a software license, which legally restricts how developers can use, modify, or integrate the code. Additionally, it has extremely low community visibility with only 6 stars, meaning the code has not been broadly tested or vetted by a wider audience.

Verdict
Use with caution: the code is technically safe and well-architected, but the complete absence of an open-source license and minimal community adoption present legal and long-term reliability risks.
SUMMARY

A production-grade, AI-native Model Context Protocol (MCP) server providing LLMs with comprehensive, intent-driven access to the entire openFDA API Ecosystem.

README.md

OpenFDA FastMCP Server 🧬🤖

Built for modern GenAI orchestrators (like Claude Desktop) operating in the Healthcare and Life Sciences (HCLS) domain, this server moves beyond basic 1:1 API proxies. It utilizes Domain-Driven Design (DDD), semantic aggregation, and stateful cursor pagination to allow AI Agents to interrogate millions of FDA adverse events, medical device clearances, and drug labels securely and reliably.


🎯 Key Capabilities & Highlights

  • 100% Endpoint Coverage: Full integration with FDA Drug, Device, Food, Cosmetics, Animal/Veterinary, Tobacco, Transparency, and non-clinical (NSDE/UNII) datasets.
  • Intent-Driven Architecture (HLD): Groups disparate API interactions into high-level LLM capabilities (e.g., analyze_drug_profile) rather than low-level database lookups.
  • Graceful "Zero Match" Intercepts: Handles queries for nonexistent or hallucinated terms by catching the backend's 404: No Matches Found error and returning a clean, empty array ([]) to the LLM instead of a stack trace.
  • Stateful Cursor Pagination (LLD): Safely bypasses the openFDA 26,000-record skip limit. The MCP Server extracts search_after tokens from the Link header, allowing LLMs to seamlessly paginate through 15+ million adverse event records without token bloat.
  • Anti-Hallucination Schemas: Strict Pydantic models use Literal[] typings to bound all valid endpoints (verified directly via web scrapes of the open.fda.gov documentation), so AI agents cannot request nonexistent slugs.
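
The zero-match intercept and the cursor extraction described above can be sketched together as a small response-handling step. This is a minimal illustration using only the standard library, not the server's actual code: the function names `extract_search_after` and `interpret_response` are hypothetical, the headers are modeled as a plain dict, and the sketch assumes every 404 means "no matches".

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Matches one Link header entry of the form: <url>; rel="next"
_LINK_NEXT = re.compile(r'<([^>]+)>\s*;\s*rel="?next"?')


def extract_search_after(link_header: Optional[str]) -> Optional[str]:
    """Return the search_after token from a Link header, or None if absent."""
    if not link_header:
        return None
    match = _LINK_NEXT.search(link_header)
    if match is None:
        return None
    # The token travels as a query parameter of the "next" URL.
    query = parse_qs(urlparse(match.group(1)).query)
    tokens = query.get("search_after")
    return tokens[0] if tokens else None


def interpret_response(status: int, headers: dict, body: dict) -> dict:
    """Turn a raw response into a small page: results plus an opaque cursor."""
    if status == 404:
        # Assumption: a 404 here means zero matches, so return an empty page
        # rather than raising.
        return {"results": [], "next_cursor": None}
    if status != 200:
        raise RuntimeError(f"openFDA request failed with HTTP {status}")
    return {
        "results": body.get("results", []),
        "next_cursor": extract_search_after(headers.get("Link")),
    }
```

Returning the cursor as an opaque string lets the agent ask for the next page by passing the token back, without ever seeing or constructing the full URL.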

🏗️ Technical Architecture (HLD & LLD)

High-Level Design (HLD)

  1. Semantic Aggregation & Progressive Disclosure:
    Rather than handing raw, multi-megabyte JSON payloads to LLMs (which overflows context windows), tools like analyze_drug_profile query Adverse Events and Recall Enforcements concurrently. The server extracts the openfda branding headers, boils the data down to top-hit summaries, and returns structured profiles.
  2. Domain-Driven Repository Structure:
    The codebase strictly separates sub-domains (e.g., domains/drug, domains/device, domains/food). This isolates logic so changes to the complex 510(k) device reporting structure don't risk breaking Cosmetic Adverse Event tools.
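
As a rough sketch of the semantic aggregation idea, the snippet below runs two lookups concurrently with `asyncio.gather` and reduces them to a compact summary. The fetch functions are hypothetical stand-ins that return canned records instead of calling openFDA, and the summary fields are illustrative only.

```python
import asyncio
from collections import Counter


async def fetch_adverse_events(drug: str) -> list[dict]:
    """Hypothetical stand-in for an adverse-event query (canned data)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [
        {"reaction": "NAUSEA"},
        {"reaction": "HEADACHE"},
        {"reaction": "NAUSEA"},
    ]


async def fetch_recalls(drug: str) -> list[dict]:
    """Hypothetical stand-in for a recall-enforcement query (canned data)."""
    await asyncio.sleep(0)
    return [{"classification": "Class II", "reason": "Labeling error"}]


async def summarize_drug(drug: str, top_n: int = 2) -> dict:
    # Run both lookups concurrently, then keep only a small summary.
    events, recalls = await asyncio.gather(
        fetch_adverse_events(drug), fetch_recalls(drug)
    )
    top = Counter(event["reaction"] for event in events).most_common(top_n)
    return {
        "drug": drug,
        "total_events": len(events),
        "top_reactions": top,
        "recall_count": len(recalls),
    }


profile = asyncio.run(summarize_drug("acetaminophen"))
```

The point of the design is that the LLM receives a few summary fields rather than the full record lists, which stay inside the server.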

Low-Level Design (LLD)

  1. Client URL Syntax Builder (utils/syntax_builder.py):
    Standard Python HTTP clients auto-encode + into %2B. OpenFDA strictly blocks %2B and demands literal +AND+ delimiters for multi-field queries. A custom query builder patches standard URL encoding to satisfy openFDA's esoteric Lucene syntax.
  2. Pydantic Guardrailing & Docstring Routing (schemas.py & tools.py):
    Each tool exposes explicit Python docstrings written for "Agentic Routing" (e.g., explaining precisely when to use the Cosmetic tool vs the Transparency tool). Pydantic validates inputs before they reach the HTTP client, so malformed requests are rejected without an API call.
  3. Async Core (core/client.py):
    A fully asynchronous httpx client that handles concurrency, rate limits, and SSL verification settings, and runs on the same event loop as FastMCP.
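
To illustrate the encoding problem the syntax builder addresses, here is a minimal standard-library sketch. It is not the contents of utils/syntax_builder.py: `build_search` and `build_url` are hypothetical names, and the quoting rule for multi-word values is an assumption made for the example.

```python
from urllib.parse import quote, urlencode


def build_search(terms: dict[str, str]) -> str:
    """Join field:value clauses with a literal +AND+ delimiter."""
    clauses = []
    for field, value in terms.items():
        # Assumption: wrap multi-word values in quotes so they stay one phrase.
        if " " in value:
            value = f'"{value}"'
        # Percent-encode the value only; the delimiter is added afterwards.
        clauses.append(f"{field}:{quote(value, safe='')}")
    return "+AND+".join(clauses)


def build_url(base: str, terms: dict[str, str], limit: int = 10) -> str:
    # Assemble the query string by hand so the + characters are not re-encoded.
    return f"{base}?search={build_search(terms)}&limit={limit}"


# For contrast: the standard encoder escapes every + in the search expression.
naive = urlencode({"search": "a:1+AND+b:2"})

url = build_url(
    "https://api.fda.gov/drug/event.json",
    {"patient.drug.medicinalproduct": "ibuprofen", "serious": "1"},
)
```

With an HTTP client that encodes `params` automatically, a hand-built URL like this would be passed as the full request URL rather than as a parameter mapping.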

🛠️ Provided Tools

This server injects the following tools into your LLM orchestrator:

| Domain | Tool Name | Agentic Use Case |
|---|---|---|
| Drugs | analyze_drug_profile | Summarize total adverse hits & critical recalls for a specific drug. |
| Drugs | search_drug_labels | Search active ingredients, warnings, and boxed alerts in pill labels. |
| Drugs | search_drug_power_query | Deep-dive raw searches against generic event datasets. |
| Devices | evaluate_medical_device | Correlate a 3-letter FDA Product Code against 510(k) clearances. |
| Devices | search_device_power_query | Raw search across PMAs, UDIs, 510(k)s, and recalls. |
| Food & Cosmetics | search_food_events & search_cosmetic_events | Investigate salmonella outbreaks, product anomalies, and cosmetic injuries. |
| Veterinary | search_animal_events | Interrogate veterinary responses across dog/cat species or flea medications. |
| Tobacco | search_tobacco_data | Query prevention/digital ad studies and problem reports. |
| Transparency | search_transparency_data | Access FDA Complete Response Letters (CRLs). |
| Other | search_other_data | Retrieve historical press releases and substance UNII codes. |

🚀 Quickstart & Installation

This project uses uv for fast, modern Python dependency management.

  1. Clone & Install

    git clone https://github.com/yourusername/openfda_mcp.git
    cd openfda_mcp
    uv pip install -e .
    
  2. Configure External Clients (Claude Desktop)
    Add the following to your claude_desktop_config.json:

    {
      "mcpServers": {
        "openfda_server": {
          "command": "uv",
          "args": [
            "--directory",
            "/absolute/path/to/openfda_mcp",
            "run",
            "fastmcp",
            "run",
            "-m",
            "openfda_mcp.main:mcp",
            "--transport",
            "stdio"
          ]
        }
      }
    }
    
  3. Running the Server Manually (Streamable HTTP)
    If you wish to test or host the server over HTTP (the streamable-http transport) instead of stdio:

    uv run fastmcp run src/openfda_mcp/main.py:mcp --transport streamable-http --port 8000
    

💬 Example LLM Prompts & Workflows

Once mounted in your Agent framework, you can formulate zero-shot biomedical queries natively:

Drug Analysis:

"Use OpenFDA to analyze the safety profile of 'Acetaminophen'. Are there any major recalls associated with it?"
The agent will automatically use analyze_drug_profile and return a compact summary without flooding the context window.

Medical Device Investigation:

"Look up FDA Product Code DTQ. How many 510(k) clearances does it have, and what are the top applications?"

Cosmetic Research:

"Query the cosmetic events dataset to find if there are any reported cancer outcomes related to Talcum or Baby Powder products from 2015-2020."


Security Note

SSL verification can be disabled via the defaults in config.py, which supports testing behind a local macOS proxy or firewall. Set OPENFDA_API_KEY via an environment variable to support bursts over 40 requests/minute in production deployments.
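
One way such settings could be read from the environment is sketched below. This is not the project's config.py: the `Settings` class and `load_settings` function are hypothetical, `OPENFDA_VERIFY_SSL` is an invented variable name used only for this example, and the sketch assumes verification should stay on unless explicitly turned off.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]
    verify_ssl: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read API key and SSL settings from environment-style variables."""
    # OPENFDA_API_KEY is named in the README; OPENFDA_VERIFY_SSL is a
    # hypothetical variable introduced for this sketch.
    raw = env.get("OPENFDA_VERIFY_SSL", "true").strip().lower()
    return Settings(
        # Treat an empty string the same as an unset key.
        api_key=env.get("OPENFDA_API_KEY") or None,
        # Verification stays enabled unless explicitly switched off.
        verify_ssl=raw not in {"0", "false", "no"},
    )
```

The resulting `verify_ssl` flag could then be handed to the HTTP client's `verify` option, and the key attached to each request.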
