Discovery - Lerian

Discovery automates data source detection and extraction through Fetcher. Instead of manually uploading files, Discovery connects to external systems, identifies available data, and extracts transactions directly into Matcher.

What Discovery solves

Manual file uploads create friction at every step. Teams export files, transfer them, monitor for failures, and re-upload when something goes wrong. The process is slow, error-prone, and breaks down as data volume grows.

Discovery replaces the manual pipeline. It connects to external systems through Fetcher, lists the data sources Fetcher can reach, and pulls transactions into Matcher on demand. When a new data source appears (a new bank connection, a new payment processor), a connection refresh makes it available in Discovery without reconfiguration.

How Discovery works

Discovery operates through Fetcher, Lerian’s internal data-extraction service. Fetcher manages connections to external databases and extracts data from them on behalf of Lerian products. Discovery exposes those connections to Matcher and coordinates the extraction process.

The workflow has seven steps:

1. Check status: Confirm Fetcher is connected and Discovery is available.
2. Browse connections: See all data sources Fetcher has access to.
3. Inspect a connection: Review the schema to understand what fields are available.
4. Test a connection: Validate the connection before committing to an extraction.
5. Create an extraction: Request that Matcher pull data from a specific source.
6. Monitor progress: Track extraction status as data flows in.
7. Refresh connections: Rescan when new data sources are added.

Discovery workflow

Check Discovery status

Verify that Fetcher is connected and Discovery is operational before starting.

curl -X GET "https://api.matcher.example.com/v1/discovery/status" \
  -H "Authorization: Bearer $TOKEN"
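
A script can gate the rest of the workflow on this check. The sketch below is one way to do that. It assumes the endpoint answers HTTP 200 when Fetcher is connected and Discovery is usable; that assumption and the function names discovery_http_code and require_discovery are illustrative, not part of the documented API.

```shell
# Prints the HTTP status code returned by the Discovery status endpoint.
# --write-out '%{http_code}' emits the code; --output /dev/null drops the body.
discovery_http_code() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    "https://api.matcher.example.com/v1/discovery/status" \
    -H "Authorization: Bearer $TOKEN"
}

# Assumption: 200 means Fetcher is connected and Discovery is operational.
# If the real API reports health in the response body, inspect the body instead.
require_discovery() (
  # A transport failure (DNS, TLS, refused connection) is treated as code 000.
  code=$(discovery_http_code) || code=000
  if [ "$code" != "200" ]; then
    echo "Discovery is not available (HTTP $code); stopping." >&2
    return 1
  fi
)
```

Calling require_discovery || exit 1 at the top of a script stops it before any extraction is requested.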

API Reference: Get Discovery status

Browse connections

List all data sources available through Fetcher.

curl -X GET "https://api.matcher.example.com/v1/discovery/connections" \
  -H "Authorization: Bearer $TOKEN"

The response lists each connection with its name, type (database, API, file store), and current status.

API Reference: List connections

Inspect a connection

Review the schema of a specific connection to understand what data fields are available before extracting.

curl -X GET "https://api.matcher.example.com/v1/discovery/connections/{connectionId}/schema" \
  -H "Authorization: Bearer $TOKEN"

Use schema inspection to confirm that required fields — transaction IDs, amounts, dates, references — exist before building field mappings.
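
The check for required fields can be scripted once the schema's field names are available as a plain list. The sketch below assumes the names have already been extracted from the schema response, one per line; the response shape is not described here, so the jq filter in the usage comment and the example field names are guesses.

```shell
# missing_fields REQUIRED...
# Reads the schema's field names from stdin, one per line, and prints every
# required name that is absent. Returns non-zero if anything is missing.
missing_fields() (
  available=$(cat)
  status=0
  for field in "$@"; do
    # -x matches the whole line, -F treats the name as a literal string.
    if ! printf '%s\n' "$available" | grep -qxF -- "$field"; then
      printf '%s\n' "$field"
      status=1
    fi
  done
  return "$status"
)

# Usage (hypothetical response shape and field names):
#   curl ... "/connections/$CONNECTION_ID/schema" \
#     | jq -r '.fields[].name' \
#     | missing_fields transaction_id amount value_date reference
```

Matching whole lines keeps a field such as amount_minor from being mistaken for amount.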

API Reference: Get connection schema

Test a connection

Validate that Matcher can reach and read from a connection before creating an extraction.

curl -X POST "https://api.matcher.example.com/v1/discovery/connections/{connectionId}/test" \
  -H "Authorization: Bearer $TOKEN"

A successful test confirms connectivity and read access. Always test before creating an extraction — especially for new or recently modified connections.
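
To make the test-first rule hard to skip, the test and the extraction request can live in one function. This is a minimal sketch, assuming a failed test is reported as an HTTP error status (which curl --fail turns into a non-zero exit code); the function name test_then_extract is illustrative.

```shell
DISCOVERY_BASE="https://api.matcher.example.com/v1/discovery"

# test_then_extract CONNECTION_ID JSON_BODY
# Requests an extraction only if the connection test succeeds first.
# Assumption: an unreachable or unreadable connection yields HTTP 4xx/5xx.
test_then_extract() (
  connection_id=$1
  body=$2
  curl --silent --show-error --fail -X POST \
    "$DISCOVERY_BASE/connections/$connection_id/test" \
    -H "Authorization: Bearer $TOKEN" >/dev/null || {
      echo "Connection $connection_id failed its test; extraction skipped." >&2
      return 1
    }
  # Reached only when the test call exited with status 0.
  curl --silent --show-error --fail -X POST \
    "$DISCOVERY_BASE/connections/$connection_id/extractions" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "$body"
)
```

If the API instead returns 200 with a failure flag in the body, the first call needs to parse the body before deciding whether to continue.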

API Reference: Test connection

Create an extraction

Request that Matcher pull transaction data from a specific connection into the current context.

curl -X POST "https://api.matcher.example.com/v1/discovery/connections/{connectionId}/extractions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "contextId": "ctx_abc123",
    "dateRange": {
      "from": "2024-01-01",
      "to": "2024-01-31"
    }
  }'

The response returns an extraction ID. Use it to monitor progress.

API Reference: Create extraction

Monitor extraction progress

Track the status of an active extraction. For large datasets, use the poll endpoint to check progress incrementally.

# Get current status
curl -X GET "https://api.matcher.example.com/v1/discovery/extractions/{extractionId}" \
  -H "Authorization: Bearer $TOKEN"

# Poll for updates
curl -X POST "https://api.matcher.example.com/v1/discovery/extractions/{extractionId}/poll" \
  -H "Authorization: Bearer $TOKEN"

Extraction status transitions from pending → running → completed (or failed). The response includes a count of records extracted and any errors encountered.
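
Because completed and failed are the terminal states, a script can loop until one of them appears. The sketch below shows one such loop with a retry limit. The helper fetch_status is hypothetical: it assumes the response has a top-level status field and that jq is installed, neither of which is confirmed here.

```shell
# wait_for_extraction EXTRACTION_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Calls fetch_status until the extraction reaches a terminal state.
# Returns 0 on completed, 1 on failed, 2 if it gives up first.
wait_for_extraction() (
  extraction_id=$1
  interval=${2:-5}
  max_attempts=${3:-120}
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    # Any other value, including a failed request, counts as not finished yet.
    status=$(fetch_status "$extraction_id") || status=unknown
    case "$status" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 2
)

# Hypothetical helper: adjust the jq filter to the real response shape.
fetch_status() {
  curl --silent --show-error --fail \
    "https://api.matcher.example.com/v1/discovery/extractions/$1" \
    -H "Authorization: Bearer $TOKEN" | jq -r '.status'
}
```

The attempt limit keeps a stalled extraction from blocking a pipeline indefinitely; the defaults of 5 seconds and 120 attempts are arbitrary and should be tuned to the dataset size.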

API Reference: Get extraction · Poll extraction

Refresh available connections

When new data sources are added to Fetcher, trigger a refresh so Discovery picks them up.

curl -X POST "https://api.matcher.example.com/v1/discovery/refresh" \
  -H "Authorization: Bearer $TOKEN"

API Reference: Refresh connections

Best practices

Always test connections before extracting

A failed extraction mid-run is harder to recover from than a failed test. Test every connection before creating an extraction — especially when connecting to a new source or after a credential rotation.

Inspect schemas before mapping fields

Field names vary across systems. A bank might call the transaction date value_date while your ledger uses posting_date. Check the schema before configuring field mappings to avoid silent mismatches.

Monitor extractions actively for large datasets

Large extractions take time. Don’t assume completion: poll the extraction status and confirm the record count before starting a match run. Starting a run on incomplete data flags records as exceptions only because their counterparts have not been extracted yet.
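
The pre-run check can be reduced to a small guard. This sketch assumes the caller already has the extraction's final status and record count, plus an expected count from the source system (for example, a row count for the same date range); the function name safe_to_match is illustrative.

```shell
# safe_to_match STATUS EXTRACTED EXPECTED
# Succeeds only when the extraction completed and the extracted record count
# equals the count expected from the source system.
safe_to_match() {
  if [ "$1" != "completed" ]; then
    echo "extraction is $1, not completed" >&2
    return 1
  fi
  if [ "$2" -ne "$3" ]; then
    echo "extracted $2 records, expected $3" >&2
    return 1
  fi
}
```

For example, safe_to_match completed 1200 1200 succeeds, while safe_to_match completed 900 1200 fails and reports the shortfall.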

Refresh connections when sources change

Discovery doesn’t scan for new connections automatically. When a new payment processor is added or a new database is onboarded to Fetcher, trigger a refresh. Otherwise, Discovery won’t show the new source.

Scope extractions to the reconciliation period

Use date range parameters to extract only the data relevant to the current reconciliation period. Extracting unbounded data increases processing time and may pull records that belong to already-closed contexts.
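
One way to guarantee a bounded range is to build the request body through a function that rejects malformed or inverted dates. The sketch below reuses the contextId and dateRange fields from the extraction example; the function name extraction_body is illustrative, and the validation covers only the shape of each date, not calendar validity.

```shell
# extraction_body CONTEXT_ID FROM TO
# Prints the JSON request body for a bounded extraction, or fails if a date
# is not shaped like YYYY-MM-DD or if FROM is later than TO.
# CONTEXT_ID is inserted as-is, so it must not contain quotes or backslashes.
extraction_body() (
  context_id=$1
  from=$2
  to=$3
  for d in "$from" "$to"; do
    case "$d" in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) echo "invalid date: $d" >&2; return 1 ;;
    esac
  done
  # Dropping the hyphens lets the dates be compared as plain integers.
  if [ "$(printf '%s' "$from" | tr -d '-')" -gt "$(printf '%s' "$to" | tr -d '-')" ]; then
    echo "range is inverted: $from is after $to" >&2
    return 1
  fi
  printf '{"contextId":"%s","dateRange":{"from":"%s","to":"%s"}}\n' \
    "$context_id" "$from" "$to"
)
```

The output can be passed to curl with -d "$(extraction_body ctx_abc123 2024-01-01 2024-01-31)", so a request without both bounds never leaves the script.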

Note: In multi-tenant mode, Matcher authenticates with Fetcher using per-tenant machine-to-machine (M2M) credentials. These credentials are managed through AWS Secrets Manager and cached automatically. See Multi-Tenant Mode for configuration details.

Next steps

External sources

Configure the external data sources that Discovery connects to.

Field mapping

Map fields from extracted data to Matcher’s transaction model.

Discovery API reference

Full API reference for Discovery endpoints.
