A semantic search implementation that can help you find a job online


IIT Jobs is a job board that dissolves geographical boundaries and helps people find better avenues in life.
For IIT Jobs, we developed a web application that looks for relevant online job postings for your desired role & location.

IIT Jobs


IIT Jobs wanted to develop a semantic search solution to further help people find better job opportunities, irrespective of the job's nature and location.


We divided the work into four modules:

  1. Search Module
  2. Scraper Module
  3. Information Extraction Module
  4. Application Module


Search Module

  • The user's search query is the input to the Search Module, which runs it as an Internet search to obtain a list of URLs (mostly job postings).
  • The Search Module then identifies whether a given link points to a web page listing several job postings or to a single job posting.
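
The listing-versus-single-posting decision can be illustrated with a minimal sketch. It assumes a simple link-counting heuristic, which is not necessarily how the Search Module decides; the names `classify_page`, `JOB_HINTS`, and `min_job_links` are hypothetical.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collects the href value of every anchor tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


# Hypothetical URL fragments that suggest a link leads to a job posting.
JOB_HINTS = ("/job", "/careers/", "/vacancy")


def classify_page(html, min_job_links=5):
    """Return "listing" if the page links to many distinct job-like URLs,
    otherwise "single". The threshold is an assumed value."""
    collector = _LinkCollector()
    collector.feed(html)
    job_links = {
        href for href in collector.hrefs
        if any(hint in href.lower() for hint in JOB_HINTS)
    }
    return "listing" if len(job_links) >= min_job_links else "single"
```

A production classifier would more likely combine several signals (page structure, repeated card layouts, structured data), so the threshold here is only a starting point.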

Scraper Module

  • The Scraper Module efficiently scrapes data from any website given to it as input.
  • A conventional scraping service returns its output as separate lines; the Scraper Module instead treats the page content as one chunk of data, which allows better information extraction.
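
As an illustration of the "one chunk" idea, the sketch below collapses a page's visible text into a single string instead of a list of lines. It is a simplified stand-in built on Python's standard library, not the Scraper Module itself; `page_to_chunk` is a hypothetical name.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Gathers visible text, ignoring the contents of script and style tags."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def page_to_chunk(html):
    """Return the page's visible text as one whitespace-normalized chunk."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    # Joining then re-splitting removes line breaks and repeated spaces.
    return " ".join(" ".join(collector.parts).split())
```

Keeping the text together means a field that spans several lines on the page, such as a multi-line job description, reaches the extraction step intact.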

Information Extraction Module

  • The Information Extraction Module performs intelligent information retrieval on the data scraped by the Scraper Module, eliminating unwanted page content such as sidebars, navigation bars, menus, and anything else irrelevant to our ML model.
  • To support data point identification, the Named Entity Recognizer uses a comprehensive dictionary for some data points, such as the job title.
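
The dictionary support for the Named Entity Recognizer can be pictured as a lookup against a list of known job titles. The sketch below is an assumption about the general technique, not the project's recognizer; `JOB_TITLES` is a tiny hypothetical sample and `find_job_titles` is a hypothetical helper.

```python
import re

# Hypothetical sample; a real dictionary would hold far more entries.
JOB_TITLES = [
    "data engineer",
    "senior data engineer",
    "software engineer",
    "product manager",
]

# Longest titles first, so "senior data engineer" wins over "data engineer".
_TITLE_RE = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(JOB_TITLES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_job_titles(text):
    """Return dictionary job titles found in the text, lowercased, in order."""
    return [match.group(1).lower() for match in _TITLE_RE.finditer(text)]
```

A dictionary lookup like this is precise for known titles but misses unseen ones, which is why it supports a statistical recognizer rather than replacing it.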

Application Module

  • The Application Module is the user interaction module that takes user input as a search query and carries out all the processing efficiently.
  • Unlike popular job portals, this portal does not require a user to specify the location, job title, or company.


The newly developed job portal brings you results scraped directly from the Internet that are best suited to your needs.

For future improvements, we plan to improve information extraction techniques, introduce a chat mechanism to refine search results & provide an option to let users add multiple resumes.

