# Discovery

> Use Discovery and Fetcher to detect external data sources, inspect their schemas, and pull transactions into Matcher automatically.

Discovery automates data source detection and extraction through Fetcher. Instead of relying on manual file uploads, Discovery connects to external systems, identifies available data, and extracts transactions directly into Matcher.

## What Discovery solves

***

Manual file uploads create friction at every step. Teams export files, transfer them, monitor for failures, and re-upload when something goes wrong. The process is time-consuming and error-prone, and it breaks down as data volume grows.

Discovery replaces the manual pipeline. It connects to external systems through Fetcher, detects the data sources available there, and pulls transactions into Matcher on demand. When a new data source appears, such as a new bank connection or a new payment processor, Discovery picks it up on the next refresh, with no reconfiguration.

## How Discovery works

***

Discovery operates through Fetcher, Lerian's data ingestion service. Fetcher manages connections to external systems: databases, APIs, file stores, and banking platforms. Discovery exposes those connections to Matcher and coordinates the extraction process.

The workflow has seven steps:

1. **Check status** — Confirm Fetcher is connected and Discovery is available.
2. **Browse connections** — See all data sources Fetcher has access to.
3. **Inspect a connection** — Review the schema to understand what fields are available.
4. **Test a connection** — Validate the connection before committing to an extraction.
5. **Create an extraction** — Request that Matcher pull data from a specific source.
6. **Monitor progress** — Track extraction status as data flows in.
7. **Refresh connections** — Rescan when new data sources are added.
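
The seven steps above are all HTTP calls that share a base URL and an authorization header. A minimal sketch of a wrapper follows, assuming the `api.matcher.example.com` host used in the examples on this page. The function name `discovery_api` and the variables `DISCOVERY_BASE_URL` and `DISCOVERY_CURL` are hypothetical and not part of Matcher.

```bash
# Hypothetical helper, not part of Matcher: wraps the curl options that
# every Discovery call on this page has in common.
# DISCOVERY_BASE_URL and DISCOVERY_CURL are illustrative variable names.
DISCOVERY_BASE_URL="${DISCOVERY_BASE_URL:-https://api.matcher.example.com}"

discovery_api() {
  local method="$1" path="$2"
  shift 2
  # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
  # Any extra arguments (for example -d '{...}') are passed through.
  "${DISCOVERY_CURL:-curl}" -sS --fail -X "$method" \
    "${DISCOVERY_BASE_URL}${path}" \
    -H "Authorization: Bearer ${TOKEN}" "$@"
}
```

With this in place, the status check could be written as `discovery_api GET /v1/discovery/status`, and a refresh as `discovery_api POST /v1/discovery/refresh`.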

## Discovery workflow

***

### Check Discovery status

Verify that Fetcher is connected and Discovery is operational before starting.

```bash
curl -X GET "https://api.matcher.example.com/v1/discovery/status" \
  -H "Authorization: Bearer $TOKEN"
```

<Tip>API Reference: [Get Discovery status](/en/reference/matcher/discovery-status)</Tip>

### Browse connections

List all data sources available through Fetcher.

```bash
curl -X GET "https://api.matcher.example.com/v1/discovery/connections" \
  -H "Authorization: Bearer $TOKEN"
```

The response lists each connection with its name, type (database, API, file store), and current status.

<Tip>API Reference: [List connections](/en/reference/matcher/list-discovery-connections)</Tip>

### Inspect a connection

Review the schema of a specific connection to understand what data fields are available before extracting.

```bash
curl -X GET "https://api.matcher.example.com/v1/discovery/connections/{connectionId}/schema" \
  -H "Authorization: Bearer $TOKEN"
```

Use schema inspection to confirm that required fields — transaction IDs, amounts, dates, references — exist before building field mappings.
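
One way to act on the schema is to compare the fields you need against the field names the connection exposes. The sketch below assumes you have already pulled the field names out of the schema response into a space-separated list; the response format is not shown on this page, and the function name `missing_fields` and the sample field names are illustrative only.

```bash
# Hypothetical helper: print every required field that is absent from the
# list of available fields. Both arguments are space-separated names.
missing_fields() {
  local required="$1" available="$2" field
  for field in $required; do
    case " $available " in
      *" $field "*) ;;               # field is present, nothing to report
      *) printf '%s\n' "$field" ;;   # field is missing
    esac
  done
}

# Example with made-up field names:
missing_fields "transaction_id amount value_date reference" \
               "transaction_id amount posting_date reference"
# prints: value_date
```

An empty result means every required field exists, so field mapping can proceed.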

<Tip>API Reference: [Get connection schema](/en/reference/matcher/get-connection-schema)</Tip>

### Test a connection

Validate that Matcher can reach and read from a connection before creating an extraction.

```bash
curl -X POST "https://api.matcher.example.com/v1/discovery/connections/{connectionId}/test" \
  -H "Authorization: Bearer $TOKEN"
```

A successful test confirms connectivity and read access. Always test before creating an extraction — especially for new or recently modified connections.
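
In a script, the test can act as a gate in front of the extraction. The sketch below is one way to wire that up; `extract_if_reachable` is a hypothetical name, and both arguments are commands or functions that you define yourself.

```bash
# Hypothetical helper: run the extraction step only when the test step
# succeeds. Both arguments are names of commands or functions you define.
extract_if_reachable() {
  local test_cmd="$1" extract_cmd="$2"
  if "$test_cmd"; then
    "$extract_cmd"
  else
    echo "connection test failed, extraction skipped" >&2
    return 1
  fi
}
```

The test command could be the `curl` call above with `--fail` added, which makes `curl` exit non-zero on HTTP 4xx and 5xx responses. Whether a failed connection test is reported through the HTTP status or through a field in the response body is not stated here, so check the API reference before relying on the exit code alone.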

<Tip>API Reference: [Test connection](/en/reference/matcher/test-discovery-connection)</Tip>

### Create an extraction

Request that Matcher pull transaction data from a specific connection into the current context.

```bash
curl -X POST "https://api.matcher.example.com/v1/discovery/connections/{connectionId}/extractions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "contextId": "ctx_abc123",
    "dateRange": {
      "from": "2024-01-01",
      "to": "2024-01-31"
    }
  }'
```

The response returns an extraction ID. Use it to monitor progress.
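
When the request body is built in a script, it helps to validate the date range before sending it. The following sketch assumes the body shape shown above (`contextId` plus a `dateRange` with `from` and `to`); the function name `extraction_body` is hypothetical, and the check covers only the shape and order of the dates, not calendar validity.

```bash
# Hypothetical helper: build the JSON body for an extraction request.
# Dates must look like YYYY-MM-DD; in that format, string order matches
# date order, so a plain string comparison is enough to order them.
# Assumes the context ID contains no characters that need JSON escaping.
extraction_body() {
  local context_id="$1" from="$2" to="$3" d
  for d in "$from" "$to"; do
    case "$d" in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) echo "invalid date: $d" >&2; return 1 ;;
    esac
  done
  # Reject ranges where "from" is later than "to".
  if [ "$from" \> "$to" ]; then
    echo "from ($from) is after to ($to)" >&2
    return 1
  fi
  printf '{"contextId":"%s","dateRange":{"from":"%s","to":"%s"}}\n' \
    "$context_id" "$from" "$to"
}
```

The result can be passed to the request as `-d "$(extraction_body ctx_abc123 2024-01-01 2024-01-31)"`.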

<Tip>API Reference: [Create extraction](/en/reference/matcher/create-extraction)</Tip>

### Monitor extraction progress

Track the status of an active extraction. For large datasets, use the poll endpoint to check progress incrementally.

```bash
# Get current status
curl -X GET "https://api.matcher.example.com/v1/discovery/extractions/{extractionId}" \
  -H "Authorization: Bearer $TOKEN"

# Poll for updates
curl -X POST "https://api.matcher.example.com/v1/discovery/extractions/{extractionId}/poll" \
  -H "Authorization: Bearer $TOKEN"
```

An extraction starts as `pending`, moves to `running`, and ends as either `completed` or `failed`. The response includes a count of records extracted and any errors encountered.
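
Because `completed` and `failed` are the only terminal states, a polling loop can stop on either one and keep waiting otherwise. The sketch below assumes you supply a command that prints the current status as a bare word; how that word is read from the JSON response is left open because the response format is not shown here. The name `wait_for_extraction`, the attempt limit, and the delay are all illustrative.

```bash
# Hypothetical helper: poll until the extraction reaches a terminal status.
# $1 is a command that prints the current status as a single word.
# $2 is the maximum number of attempts (default 30).
# $3 is the delay between attempts in seconds (default 5).
wait_for_extraction() {
  local get_status="$1" max_attempts="${2:-30}" delay="${3:-5}"
  local attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$("$get_status")
    case "$status" in
      completed) echo "completed"; return 0 ;;
      failed)    echo "failed";    return 1 ;;
      pending|running) sleep "$delay" ;;
      *) echo "unexpected status: $status" >&2; return 2 ;;
    esac
    attempt=$((attempt + 1))
  done
  echo "timed out after $max_attempts attempts" >&2
  return 3
}
```

The status command can wrap either of the two calls above. Bounding the number of attempts keeps a stalled extraction from blocking a pipeline indefinitely.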

<Tip>API Reference: [Get extraction](/en/reference/matcher/retrieve-extraction) · [Poll extraction](/en/reference/matcher/poll-extraction)</Tip>

### Refresh available connections

When new data sources are added to Fetcher, trigger a refresh so Discovery picks them up.

```bash
curl -X POST "https://api.matcher.example.com/v1/discovery/refresh" \
  -H "Authorization: Bearer $TOKEN"
```
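
To see what a refresh actually added, you can compare the connection list from before and after. The sketch below assumes you have saved the connection names to two files, one name per line; extracting those names from the list response is not covered here, and `new_connections` is a hypothetical helper name.

```bash
# Hypothetical helper: print connection names that appear in the "after"
# file but not in the "before" file. Each file holds one name per line.
new_connections() {
  local before="$1" after="$2" sorted_before sorted_after
  sorted_before=$(mktemp) && sorted_after=$(mktemp) || return 1
  # comm requires sorted input, so sort both lists first.
  sort -u "$before" > "$sorted_before"
  sort -u "$after"  > "$sorted_after"
  # comm -13 suppresses lines unique to file 1 and lines common to both,
  # leaving only lines unique to file 2.
  comm -13 "$sorted_before" "$sorted_after"
  rm -f "$sorted_before" "$sorted_after"
}
```

An empty result after a refresh suggests the new source has not been onboarded to Fetcher yet.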

<Tip>API Reference: [Refresh connections](/en/reference/matcher/refresh-discovery)</Tip>

## Best practices

***

<AccordionGroup>
  <Accordion title="Always test connections before extracting">
    A failed extraction mid-run is harder to recover from than a failed test. Test every connection before creating an extraction — especially when connecting to a new source or after a credential rotation.
  </Accordion>

  <Accordion title="Inspect schemas before mapping fields">
    Field names vary across systems. A bank might call the transaction date `value_date` while your ledger uses `posting_date`. Check the schema before configuring field mappings to avoid silent mismatches.
  </Accordion>

  <Accordion title="Monitor extractions actively for large datasets">
    Large extractions take time. Don't assume completion: poll the extraction status and confirm the record count before starting a match run. Starting a run on incomplete data produces false exceptions for transactions whose counterparts have not been extracted yet.
  </Accordion>

  <Accordion title="Refresh connections when sources change">
    Discovery doesn't scan for new connections automatically. When a new payment processor is added or a new database is onboarded to Fetcher, trigger a refresh. Otherwise, Discovery won't show the new source.
  </Accordion>

  <Accordion title="Scope extractions to the reconciliation period">
    Use date range parameters to extract only the data relevant to the current reconciliation period. Extracting unbounded data increases processing time and may pull records that belong to already-closed contexts.
  </Accordion>
</AccordionGroup>
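
To apply the date-scoping practice above to a monthly reconciliation, the `from` and `to` values can be derived from the period itself. This is a minimal sketch that assumes the period is a calendar month written as `YYYY-MM`; `month_range` is a hypothetical helper name.

```bash
# Hypothetical helper: print the first and last day of a month given as
# YYYY-MM, for use as the "from" and "to" values of dateRange.
month_range() {
  local year="${1%-*}" month="${1#*-}" last
  case "$month" in
    01|03|05|07|08|10|12) last=31 ;;
    04|06|09|11)          last=30 ;;
    02)
      # Gregorian leap year: divisible by 4, except century years that
      # are not divisible by 400.
      if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
        last=29
      else
        last=28
      fi ;;
    *) echo "invalid month: $1" >&2; return 1 ;;
  esac
  printf '%s-%s-01 %s-%s-%s\n' "$year" "$month" "$year" "$month" "$last"
}

month_range 2024-01   # prints: 2024-01-01 2024-01-31
```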

<Note>
  In multi-tenant mode, Matcher authenticates with Fetcher using per-tenant machine-to-machine (M2M) credentials. These credentials are managed through AWS Secrets Manager and cached automatically. See [Multi-Tenant Mode](/en/matcher/configuration/matcher-multi-tenant#m2m-credentials-fetcher-integration) for configuration details.
</Note>

## Next steps

***

<CardGroup cols={2}>
  <Card title="External sources" icon="building-columns" href="/en/matcher/integrations/matcher-external-sources">
    Configure the external data sources that Discovery connects to.
  </Card>

  <Card title="Field mapping" icon="arrows-left-right" href="/en/matcher/configuration/matcher-field-mapping">
    Map fields from extracted data to Matcher's transaction model.
  </Card>

  <Card title="Discovery API reference" icon="code" href="/en/reference/matcher/discovery-status">
    Full API reference for Discovery endpoints.
  </Card>
</CardGroup>
