Description of the Observatory architecture

  • Initially, an administrator populates a URL repository with web site URLs.
  • The crawler gets web site URLs from the URL repository and further populates it with individual web page URLs. The crawler extracts at most 6000 pages from each site.
  • When the crawler has finished, the sampler selects 600 web pages from each site at random, giving a near-uniform random sample.
  • Each of the 600 pages is evaluated by the WAM to detect accessibility barriers, and the results are stored in an RDF database.
  • The ETL extracts these results, transforms them and inserts them into a data warehouse for long-term storage.
  • When all scheduled sites have been crawled, evaluated and loaded into the data warehouse, the data in the data warehouse is organised so that it is available in the online reporting tool.
  • Users can then view the results in the online reporting tool.
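
The crawl, sample and evaluate steps above can be illustrated with a minimal Python sketch. It is not the Observatory's actual code: the function names, the breadth-first crawl order, and the `get_links` and `evaluate` callables (standing in for page fetching and for the WAM) are all assumptions made for illustration. Only the limits of 6000 crawled pages and 600 sampled pages come from the description.

```python
import random
from collections import deque

MAX_PAGES_PER_SITE = 6000  # crawl cap given in the description
SAMPLE_SIZE = 600          # sample size given in the description


def crawl(site_url, get_links, max_pages=MAX_PAGES_PER_SITE):
    """Collect at most max_pages page URLs, starting from site_url.

    get_links is a hypothetical callable that maps a page URL to the
    URLs it links to. Breadth-first order is an assumption.
    """
    seen = {site_url}
    queue = deque([site_url])
    pages = []
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        pages.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def sample_pages(pages, k=SAMPLE_SIZE, rng=None):
    """Pick up to k distinct pages uniformly at random from the crawled set."""
    if rng is None:
        rng = random.Random()
    return rng.sample(pages, min(k, len(pages)))


def run_site(site_url, get_links, evaluate, rng=None):
    """Crawl one site, sample it, and evaluate each sampled page.

    evaluate is a hypothetical stand-in for the WAM. The returned dict
    maps each sampled page URL to its evaluation result.
    """
    pages = crawl(site_url, get_links)
    return {url: evaluate(url) for url in sample_pages(pages, rng=rng)}
```

In this sketch the sample is uniform over the pages the crawler collected, so a site with more than 6000 pages is sampled only from its crawled subset. Storing the results in the RDF database and loading them into the data warehouse are left out.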