This bounty is no longer available

AngelList jobs importer/scraper

Organization

wslyvh

Deadline

in over 262 years

Status

ENDED

800 USD

INSTRUCTIONS

I have several handlers that integrate with applicant tracking systems (ATS) and import jobs into useWeb3: https://www.useweb3.xyz/jobs

Current integrations include BreezyHR, Greenhouse, Lever, Workable and Wrk.xyz, plus a handler for jobs added manually on Airtable. Implementations can be found at https://github.com/wslyvh/useWeb3/tree/main/src/services/jobs

The AngelList handler should be implemented in a similar structure.

It should be written in TypeScript. No preferences for other tools/frameworks/etc.

You can leverage existing (open-source) work; a few tools might already be available.

Because of captcha/bot protections, this will likely need to run as an asynchronous/background process. The process should fetch all jobs and write them somewhere. This could be a database (Postgres/Supabase) or just a JSON dump to the filesystem. A handler like the ones above (e.g. Greenhouse, Airtable, etc.) could then pick up the jobs from that data source.
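
As a rough illustration of the JSON-dump option, the background process and the handler could share one file. This is only a sketch: the `JobDump` shape, the `writeDump`/`readDump` names and the file name are assumptions made for the example, not part of the existing useWeb3 code.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

// Hypothetical shape of the dump file; not taken from the useWeb3 codebase.
interface DumpedJob {
  id: string;
  title: string;
  url: string;
}

interface JobDump {
  fetchedAt: string; // ISO timestamp of the scrape run
  jobs: DumpedJob[];
}

// Called by the background scraper once a run has finished.
function writeDump(path: string, jobs: DumpedJob[]): void {
  const dump: JobDump = { fetchedAt: new Date().toISOString(), jobs };
  writeFileSync(path, JSON.stringify(dump, null, 2), "utf8");
}

// Called by the handler. Returns an empty list if the dump is missing or
// unreadable, so a failed scrape does not break the job import.
function readDump(path: string): DumpedJob[] {
  try {
    const dump = JSON.parse(readFileSync(path, "utf8")) as JobDump;
    return Array.isArray(dump.jobs) ? dump.jobs : [];
  } catch {
    return [];
  }
}

// Example round trip using a temporary directory.
const dumpPath = join(mkdtempSync(join(tmpdir(), "jobs-")), "angellist.json");
writeDump(dumpPath, [{ id: "1", title: "Engineer", url: "https://example.com/jobs/1" }]);
const loaded = readDump(dumpPath);
console.log(loaded.length);
```

Keeping the scraper and the handler decoupled through a file (or a table) means the handler stays fast and never triggers bot protection itself.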

AngelList does NOT have an API, so it would require another way (e.g. web scraping) to pull in job information. The scraped data should include the following:

  • Job title
  • Short description
  • Full description / body text (incl. basic formatting)
  • Location
  • Type (e.g. full-time)
  • Date posted / updated (e.g. 3 days ago)
  • Salary range
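
The fields above could map onto a record like the one below. This is a sketch under assumptions: the `ScrapedJob` field names are illustrative, and `parseRelativeDate` assumes the posted date appears as plain English text such as "3 days ago", which would need to be checked against the actual pages.

```typescript
// Hypothetical record for one scraped job; field names are illustrative.
interface ScrapedJob {
  title: string;
  shortDescription: string;
  body: string; // full description, keeping basic formatting (e.g. HTML)
  location: string;
  type: string; // e.g. "full-time"
  postedAt?: Date;
  salaryRange?: string; // kept as displayed text; parsing it is out of scope here
}

const UNIT_MS: Record<string, number> = {
  minute: 60 * 1000,
  hour: 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000,
  week: 7 * 24 * 60 * 60 * 1000,
  month: 30 * 24 * 60 * 60 * 1000, // approximation: a month is treated as 30 days
};

// Converts text such as "3 days ago" into an absolute date relative to `now`.
// Returns undefined when the text does not match the assumed pattern.
function parseRelativeDate(text: string, now: Date = new Date()): Date | undefined {
  const match = /^(\d+)\s+(minute|hour|day|week|month)s?\s+ago$/i.exec(text.trim());
  if (!match) return undefined;
  const amount = Number(match[1]);
  const unit = match[2].toLowerCase();
  return new Date(now.getTime() - amount * UNIT_MS[unit]);
}

// Example: building a record from already-extracted strings.
const exampleJob: ScrapedJob = {
  title: "Engineer",
  shortDescription: "Build things",
  body: "<p>Build things</p>",
  location: "Remote",
  type: "full-time",
  postedAt: parseRelativeDate("3 days ago", new Date("2022-06-10T00:00:00Z")),
};
console.log(exampleJob.postedAt?.toISOString());
```

Storing an absolute date at scrape time avoids "3 days ago" going stale in the dump.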

That likely means you need to scrape at least two kinds of pages: the job overview page, and then each detail page it links to. If scraping is indeed the way to go, it should take into account bot/crawler protection and captcha verification.

The integration doesn't have to include all the jobs from AngelList. It should work on a per-company basis. Some examples of companies and their jobs I'd like to import:
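
The overview-then-detail flow, including pagination, might be structured as below. This is a sketch, not a working scraper: the `PageLoaders` interface and its return shapes are hypothetical, and the actual page loading (headless browser, captcha handling) is deliberately left to whatever implements it. The delay between requests is only a politeness measure and does not by itself get past bot protection.

```typescript
// One row from a company's overview page (hypothetical shape).
interface JobLink {
  title: string;
  detailUrl: string;
}

interface OverviewPage {
  jobs: JobLink[];
  nextPageUrl?: string; // undefined on the last page
}

// Loaders are injected so the crawl logic does not depend on how pages are
// fetched or parsed. Both methods are assumptions for this example.
interface PageLoaders {
  loadOverview(url: string): Promise<OverviewPage>;
  loadDetail(url: string): Promise<string>; // resolves to the job body
}

interface CrawledJob {
  title: string;
  url: string;
  body: string;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Walks every overview page of one company and follows each job link.
// `maxPages` guards against pagination loops.
async function crawlCompany(
  startUrl: string,
  loaders: PageLoaders,
  delayMs: number = 1000,
  maxPages: number = 20
): Promise<CrawledJob[]> {
  const results: CrawledJob[] = [];
  let url: string | undefined = startUrl;
  for (let page = 0; url !== undefined && page < maxPages; page++) {
    const overview: OverviewPage = await loaders.loadOverview(url);
    for (const link of overview.jobs) {
      await sleep(delayMs); // pause between detail requests
      const body = await loaders.loadDetail(link.detailUrl);
      results.push({ title: link.title, url: link.detailUrl, body });
    }
    url = overview.nextPageUrl;
  }
  return results;
}
```

Because the loaders are plain functions, the crawl loop can be exercised with in-memory fakes before any real page is requested.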

  • https://angel.co/company/gitcoin/jobs
  • https://angel.co/company/protocol-labs/jobs (with pagination)
  • etc.

The companies need to be configurable. They could be stored in a JSON config or as a constant in code. See the current constants for an example: https://github.com/wslyvh/useWeb3/blob/main/src/utils/constants.ts
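
A constant-based configuration could look roughly like this. The `ANGELLIST_COMPANIES` name and the `companyJobsUrl` helper are made up for illustration; the slugs and URL pattern come from the example links above.

```typescript
// Hypothetical config constant: one entry per company slug to import.
const ANGELLIST_COMPANIES: string[] = ["gitcoin", "protocol-labs"];

// Builds the jobs overview URL for a company, following the pattern of the
// example links in this bounty.
function companyJobsUrl(slug: string): string {
  return `https://angel.co/company/${encodeURIComponent(slug)}/jobs`;
}

const overviewUrls = ANGELLIST_COMPANIES.map(companyJobsUrl);
console.log(overviewUrls);
```

Adding a company then only means appending its slug to the list.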