PhD Position Deep Web Entity Monitoring
University of Twente - Faculty of Electrical Engineering, Mathematics and Computer Science (EWI)
The challenge
A large part of the web, the invisible web or deep web, cannot be indexed by web crawlers, for instance dynamic web pages that are returned in response to filling in a web form, or performing a search in a search engine. Instead of crawling deep web data, the approach will monitor web pages for certain (types of) queries. The objective is to develop approaches for monitoring web data that allow users to see a page's full history of relevant/important changes by identifying entities: people, organizations, products, geographic locations, events, etc. The approach should relate changes in multiple web sites, giving the user a data-warehouse-like overview of the pages they monitor; drilling down to time periods, persons, events, etc.
The research will be done in co-operation with WCC. WCC, www.wcc-group.com, started in 1996 and is a successful software company based in Utrecht (NL) and Reston (USA). WCC's current focus areas are the Employment and Identification Security markets. Both commercial and government customers worldwide use WCC's smart search & match solutions to support their primary processes. Both WCC and the Database Group of the University of Twente have made significant advances in entity matching and entity ranking, applied to, for instance, Employment Matching and Expert Search. This project will extend this work to the monitoring of deep web pages, such as social networking sites, micro-blogging sites, job sites, etc. The candidate will spend part of the time at WCC in Utrecht.
The research is part of a larger national research project on Information Retrieval for Information Services (COMMIT P1). Validation of the research will be done in co-operation with the partners in the national project, using developed prototypes and field trials in real-life settings.
Our offer
We offer a full-time PhD position in a dynamic and international environment. The duration of the contract is four years. The gross monthly salary will range from € 2.042,00 in the first year to € 2.612,00 in the fourth year. The University of Twente provides excellent facilities for professional and personal development. In addition, we offer a holiday allowance of 8%, an end-of-year bonus of 8.3% (approximately an extra month of salary) and a number of additional benefits. The labor agreements are in accordance with the CAO-NU for Dutch universities.
Your profile
We are looking for a candidate with an MSc degree in computer science or an equivalent degree. The work involves both theory and practice (implementing and evaluating a research prototype). The candidate should have good programming skills. Affinity with web technology, information retrieval, and information extraction is desirable. The candidate must be a team player and must have good communication skills, both written and oral.
Please use this email address only for obtaining information!
Dr. D. Hiemstra, email d.hiemstra@utwente.nl.
If you meet the above-mentioned requirements, you are invited to submit your application letter together with:
- curriculum vitae
- a list of publications
- a list of courses you have followed (with grades)
- the names and addresses of two referees
- one or two of your research papers and/or (a link to) your MSc thesis
Please submit your application before June 12, 2011, through the application link on our website.
The organisation
The University of Twente is the only campus university in the Netherlands. Divided over six faculties, we provide more than fifty educational programmes. The University of Twente has a strong focus on personal development, and talented researchers are given scope for carrying out pioneering research.
The Faculty of Electrical Engineering, Mathematics and Computer Science (EWI) is the place where these three disciplines shape Information and Communication Technology at the University of Twente. ICT is more than communication. Mathematics, electronics and computer technology are present in almost every product we use, and ICT now contributes to all of society's activities. It is no longer about gadgets that we do or do not want to use. The real integration of ICT is just beginning. What we need is more intelligence in our systems and improvement of their natural interaction with humans.
To apply for this position, visit AcademicTransfer for the original posting, and many other scientific vacancies.