Dataset details
- Collected at random over a period of about one year (between 02.2024 and 03.2025) from online job listings related to Data Quality and Data Operations, posted in Israel (or Israel remote).
- Labelled with canonical titles from the taxonomy, when applicable.
Link to dataset
Labelled Job Descriptions v1
CSV fields
- true labels: one or more job title labels (pipe-separated, “|”) for the job description and raw title. True labels may be taxonomy titles, but do not have to be.
- exact or partial canonical title match: one or more canonical title labels from the taxonomy (pipe-separated, “|”) for the job description and raw title.
- raw job title: the job title as it appears in the original listing.
- company: the company that posted the original listing.
- job description: the job description as it appears in the original listing.
- link: the URL where this listing was found.
- notes: for internal purposes.
- missing canonical title: a canonical title that is not in the taxonomy (for internal purposes).
- missing alternative alias: an alias that is not in the taxonomy (for internal purposes).
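
As a rough illustration of how the pipe-separated fields might be read, here is a minimal Python sketch. It assumes the CSV header uses the field names exactly as listed above, which may not match the actual file, and the sample row is synthetic, invented only for this example.

```python
import csv
import io

# Assumed column names, copied from the field list above; the real CSV header may differ.
MULTI_LABEL_FIELDS = ("true labels", "exact or partial canonical title match")


def split_labels(cell):
    """Split a pipe-separated cell into a list of trimmed, non-empty labels."""
    return [part.strip() for part in (cell or "").split("|") if part.strip()]


def read_rows(handle):
    """Yield one dict per CSV row, with the multi-label fields parsed into lists."""
    for row in csv.DictReader(handle):
        for field in MULTI_LABEL_FIELDS:
            row[field] = split_labels(row.get(field))
        yield row


# Synthetic one-row sample, invented for illustration only (not taken from the dataset).
SAMPLE = io.StringIO(
    "true labels,exact or partial canonical title match,raw job title,company\n"
    "Data Research Analyst | NLP Data Analyst,Data Research Analyst,Data Specialist,ExampleCo\n"
)

rows = list(read_rows(SAMPLE))
print(rows[0]["true labels"])
```

To read the actual file, the same `read_rows` function could be given an open file handle instead of the in-memory sample.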
Guidelines
- Read the listing’s raw job title and job description.
- Label the listing with the one or more canonical titles from the taxonomy that best describe it; if no canonical title fits, enter a job title label of your own or the listing’s raw title.
- Labeling best practices:
  - Carefully read both the job listing title and the job description.
  - Label with a canonical title that is as specific and accurate as possible.
  - You can label multiple titles:
    - If more than one title fits the job listing.
    - If you are unsure whether your title label fully describes the job listing, you can label both the specific title (canonical_title) and its parent title (parent_title).
  - Generic titles (marked as generic_title=yes in the taxonomy):
    - Because generic titles are highly ambiguous, try to avoid labeling with them; if a generic title is unavoidable, provide multiple labels.
    - For example, a listing’s raw title sometimes fully matches a generic taxonomy title such as “Data Specialist”. Such a listing should also be labeled with a more specific canonical title from the taxonomy, chosen according to the requirements in the job description.
- Labeling examples:
  - When a job listing’s main requirements span different branches of the taxonomy, you can use multiple labels to reflect that. For example, a role that focuses on data research and quality, but also requires NLP data analysis skills, can be annotated as “Data Research Analyst | NLP Data Analyst”.
  - A “Data Expert” listing from the company Ask.ai was labelled as “ML Data Analyst”, because the listing describes the Data Expert as something in between a Data Analyst and a Data Scientist: in charge of analyzing data and controlling quality, but also of building and refining prompts using ML techniques.
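
The multi-label and generic-title guidelines above could be checked mechanically along the following lines. This is only a sketch: the `TAXONOMY` dictionary, its parent links, and the helper names are hypothetical stand-ins. Only the attribute names canonical_title, parent_title and generic_title come from the guidelines.

```python
# Toy taxonomy for illustration; the real taxonomy's entries and parent links may differ.
# Keys play the role of canonical_title.
TAXONOMY = {
    "Data Specialist": {"parent_title": None, "generic_title": "yes"},
    "Data Analyst": {"parent_title": None, "generic_title": "no"},
    "Data Research Analyst": {"parent_title": "Data Analyst", "generic_title": "no"},
    "NLP Data Analyst": {"parent_title": "Data Analyst", "generic_title": "no"},
}


def check_labels(labels):
    """Return a list of warnings for one listing's labels, following the guidelines."""
    warnings = []
    if not labels:
        warnings.append("no label given")
    for label in labels:
        entry = TAXONOMY.get(label)
        if entry is None:
            # Not an error: the guidelines allow free-text or raw-title labels.
            warnings.append(f"'{label}' is not a canonical title")
        elif entry["generic_title"] == "yes" and len(labels) == 1:
            warnings.append(f"'{label}' is generic; add a more specific label")
    return warnings


def with_parent(label):
    """Return the label plus its parent title, for cases where the annotator is unsure."""
    entry = TAXONOMY.get(label)
    parent = entry["parent_title"] if entry else None
    return [label, parent] if parent else [label]


def format_labels(labels):
    """Join labels into the pipe-separated form used in the CSV."""
    return " | ".join(labels)


print(check_labels(["Data Specialist"]))
print(format_labels(["Data Research Analyst", "NLP Data Analyst"]))
```

A check like this only flags candidates for review; choosing the most specific and accurate title still depends on reading the job description.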