Trace - Jobs Core API Documentation

Overview

Trace Jobs Core ingests job data daily from public machine‑readable endpoints and translates it into a clean, consistent format built on Schema.org. Every record is normalized only where the upstream source provides explicit structure, then canonicalized via RFC 8785 and committed to a content‑addressed store. Zero scraping, zero guesswork, maximum fidelity.

The API is designed for teams that need trustworthy job data without the noise of HTML parsing or the bias of heuristic “normalization.” You get stable IDs, predictable fields, and best‑effort ISO mapping where possible — with original upstream values preserved whenever a field is ambiguous or unmappable. This keeps the data faithful to the source and gives you full control over how to interpret it.


Best-effort Normalization

Trace Jobs Core normalizes fields only when the upstream source provides unambiguous, explicit, structured values. Everything else is preserved exactly as reported. This keeps the data faithful to the source and gives you full control over how to interpret ambiguous or missing fields.

We don’t guess. We don’t infer. We don’t rewrite upstream intent.

You get clean structure where it exists and raw truth where it doesn’t.

Optional Fields

If the upstream source omits a field, we omit it too. We do not fabricate values.

Why this matters

Every team has its own rules for interpreting ambiguous fields. By avoiding assumptions, we give you the freedom to apply your own logic, models, and heuristics — without fighting ours.


Endpoints

1 GET https://api.kaleh.net/trace/jobs/core/search

Executes a stateless query against the raw job posting index database.

1.1 Query Filter Logic

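The index evaluates multi-value lookups based on key repetition: values supplied for the same key are combined with OR, and distinct keys are combined with AND.

The following block shows a composite query string and the filter it is equivalent to: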

?country=US&country=SG&employment_type=Fulltime
    
WHERE (country == "US" OR country == "SG")
  AND (employment_type == "Fulltime")
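As a minimal sketch, the repeated-key form above can be produced with Python's standard library. The helper name build_search_url is hypothetical and not part of the API; only the endpoint URL and parameter names come from this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.kaleh.net/trace/jobs/core/search"


def build_search_url(filters):
    """Build a search URL; list values become repeated keys (hypothetical helper)."""
    # doseq=True expands {"country": ["US", "SG"]} into country=US&country=SG
    query = urlencode(filters, doseq=True)
    return f"{BASE_URL}?{query}" if query else BASE_URL


url = build_search_url({"country": ["US", "SG"], "employment_type": "Fulltime"})
print(url)
```

Passing a list for a key yields the OR group for that key, while separate keys yield the AND between groups.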

1.2 Query Parameters

All parameters are optional; an empty query matches the entire set. Array‑style filters are expressed by repeating the same key in the query string.

Parameter          Type     Description
title              String   Boosts records that match the job title.
location           String   Boosts records that match the location string (e.g., Chicago, NSW, Japan).
industry           String   Boosts records that match structural sector tags (e.g., Aerospace).
company            String   Boosts records that match the company name (e.g., Accenture).
employment_type    String   Restricts the result set by employment type: Fulltime, Permanent, Parttime, Contract, etc.
job_location_type  String   Restricts the result set by job location type: Onsite, Hybrid, Remote.
language           String   Restricts the result set by ISO 639-1 language code (e.g., en).
country            String   Restricts the result set by ISO 3166-1 alpha-2 country code (e.g., US, SG, MY).
region             String   Restricts the result set by state, territory, or zone (e.g., California, Penang).
city               String   Restricts the result set by city or municipality (e.g., San Carlos, Bangkok).
posted_after       String   Restricts the result set to records posted after the given date (YYYY-MM-DD).
currency           String   Restricts the result set by ISO 4217 currency code.
salary_min         Float    Minimum threshold, applied to each record's salary_max.
salary_max         Float    Maximum threshold, applied to each record's salary_min.
sortby             String   Sort key: bm25, date_posted, salary_min, or salary_max (default: bm25).
offset             Integer  Index at which the returned page starts (default: 0).

1.3 Response Format

All responses return a simple JSON envelope containing the result set and the next pagination offset. The results array contains Schema.org JobPosting objects exactly as reported by the upstream source, with best‑effort normalization applied where structured fields exist.

{
"results": [...],
"next_offset": 12345
}

The envelope is stable, deterministic, and identical across all endpoints.
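The sketch below shows one way a client might unpack this envelope. The sample body is made up for illustration, and the function name parse_envelope is hypothetical. Because optional fields are omitted when the upstream source does not supply them, record fields are read with .get() rather than direct indexing.

```python
import json


def parse_envelope(body):
    """Return (results, next_offset) from a JSON response body (hypothetical helper)."""
    envelope = json.loads(body)
    results = envelope.get("results", [])
    # Assumption: treat a missing next_offset as None rather than raising.
    next_offset = envelope.get("next_offset")
    return results, next_offset


# Illustrative body only; real records carry whatever fields the upstream feed provides.
sample = '{"results": [{"@type": "JobPosting", "title": "Data Engineer"}], "next_offset": 12}'
results, next_offset = parse_envelope(sample)

for job in results:
    # datePosted is absent in this sample, so .get() returns None instead of failing.
    print(job.get("title"), job.get("datePosted"))
```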

1.4 Record Field Coverage

Trace Jobs Core exposes only the fields that upstream structured sources provide. Schema.org defines many optional fields; most feeds do not supply them. The lists below reflect the fields you will commonly see in practice, not the full Schema.org surface area.

Field presence varies by source. If a field is not provided upstream, it will not appear in the record.

1.4.1 Frequently Present Fields

These appear in most records across the index:

1.4.2 Less Frequently Present Fields

These appear when upstream sources include them, and many feeds omit them:

1.5 Pagination Model

The core engine uses explicit, offset-based pagination with a fixed page size of 12 records. Each response includes a root-level next_offset integer that gives the offset of the next page. To iterate through records, read this value and pass it as the offset parameter of your next request.

The next_offset value is always an integer cursor, not a token.
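A pagination loop along these lines could look like the sketch below. The fetch_page callable is a hypothetical stand-in for whatever HTTP client you use; it takes the query parameters and returns the decoded envelope. This page does not state how the end of the result set is signalled, so stopping on an empty results array or a missing next_offset is an assumption.

```python
def iter_jobs(fetch_page, params=None, max_pages=None):
    """Yield records page by page by following next_offset.

    fetch_page(query) -> dict is a hypothetical caller-supplied function
    that performs the request and returns the decoded JSON envelope.
    """
    query = dict(params or {})
    query.setdefault("offset", 0)
    pages = 0
    while max_pages is None or pages < max_pages:
        envelope = fetch_page(query)
        results = envelope.get("results", [])
        if not results:
            break  # assumption: an empty page marks the end of the result set
        yield from results
        next_offset = envelope.get("next_offset")
        if next_offset is None or next_offset == query["offset"]:
            break  # assumption: no forward progress means there is nothing more to read
        query["offset"] = next_offset
        pages += 1
```

The max_pages argument is an optional safety cap so an exploratory loop does not walk the entire index.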

1.6 Authentication

All requests must include an active subscription token in the X-API-Key request header. Tokens are matched statelessly against subscription logs.

X-API-Key: YOUR_API_KEY

That’s it. There are no sessions, no refresh flow, and no secondary handshakes.
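As a minimal sketch using only the Python standard library, a request carrying the key might be built as follows. The function names and the Accept header are assumptions made for illustration; only the X-API-Key header comes from this page.

```python
import json
import urllib.request


def build_request(url, api_key):
    """Create a GET request carrying the API key (hypothetical helper)."""
    # urllib stores header names with normalized capitalization;
    # HTTP header names are case-insensitive, so the server sees the same header.
    headers = {"X-API-Key": api_key, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def fetch_json(url, api_key, timeout=10):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(build_request(url, api_key), timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network call makes the header logic easy to check without contacting the API.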


HTTP Response Codes

Network responses indicate subscription validation status cleanly at the infrastructure gateway layer:


Rate Limits

Trace Jobs Core enforces simple, predictable per‑client rate limits. Each API key is allowed up to 50 requests per minute, with a small burst allowance for short‑lived spikes. Requests that exceed this threshold return a 429 Too Many Requests response.

Rate limits apply uniformly across all endpoints.

Limit Behavior

Recommended Client Behavior

Rate limits are designed to protect system stability while remaining generous enough for typical ingestion, enrichment, and experimentation workloads.
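One common way for a client to handle 429 responses is to retry with exponential backoff. The sketch below is an illustration of that general technique, not behavior documented by this API; the send callable, the attempt count, and the delay values are all assumptions.

```python
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry a request while it returns 429, doubling the wait each time.

    send() -> (status, body) is a hypothetical caller-supplied function.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Waits base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return status, body  # still rate limited after the final attempt
```

The sleep parameter is injectable so the retry schedule can be exercised in tests without actually waiting.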


kaleh.net
Kaleh LLC, 2026