Scrape a website & insert into database & perform some tasks with the information

$250-750 CAD

Closed
Posted almost 8 years ago

Paid on delivery
I need someone to write software that will archive every listing posted on a particular website and use that information as described in the Features section of this post.

Basic logic of the program:
1. Send a request to a website that returns listings in XML format.
2. Check each listing against a MySQL database.
3. Send a web request to each new listing individually to get all of its information.
4. Run features 1, 2 and 3 (explained in detail below).
5. Upload the images from the listings to Amazon S3.
6. Add the information for each listing to a MySQL database.
7. Sleep before looping back to step 1 (see feature 4).

Limitations:
The website returns at most 20 listings per request (step 1). If every listing on a page is new, keep requesting the next page of listings until previously archived listings are found, so that no listings are missed. (During peak times, more than 20 listings can be posted within the minimum sleep period of 2 minutes.)

Features:
1. Create a table that tracks listings that are from the same user (identified by two values found in the listing). Keep a tally of how many listings that user has posted and a tally of how many of those listings are unique. (I suggest this is done on a separate thread so as not to slow down the scraping.)
2. If enabled, check each new listing's price against comparable listings on another website (a web request to an API), and calculate the average value of comparable listings using the archive of listings in my database. Use a calculation to decide whether the listing is undervalued by a configurable amount or percentage, and if so send an alert (Amazon SNS and a database entry). (This must be done on a separate thread so as not to slow down the scraping.)
3. Check each listing against search criteria, which can be configured by adding rows of criteria to a MySQL database, and send an alert (Amazon SNS and a database entry) if a new listing satisfies those criteria. (The criteria will be simple, such as whether the listing's price is >100, or whether the listing is a specific model, etc.) (This must be done on a separate thread so as not to slow down the scraping.)
4. Adjust the sleep time automatically to minimize the number of pages requested before previous listings are found (explained in Limitations). The sleep time before looping has a minimum of 2 minutes, a maximum of 15 minutes from 7AM to 11PM, and a maximum of 2 hours from 11PM to 7AM.
5. Once daily, check each active listing in the database against the website to see whether the listing has been updated or deleted. If it has been updated, save the changes to the database as a new row. If it has been deleted, change the status in the database so the listing will not be checked again. (I suggest this be a separate script run by a cron job.)

Requirements:
1. Must run on a Linux server.
2. Error handling (website down, website responds with unexpected data, etc.).
3. Log activity and errors in a text file. Send an alert if errors occur (Amazon SNS and an entry in the database).

The program can be coded in any language that can run on a Linux VPS and take advantage of the multiple IP addresses the server has. PHP would be preferred.
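The paging rule from Limitations and the sleep bounds from feature 4 can be sketched roughly as follows. This is only an illustration in Python (the post prefers PHP but allows any language that runs on Linux). The names `fetch_page`, `is_known`, `collect_new_listings` and `next_sleep`, the `"id"` key, the `max_pages` cap, and the halve/grow adjustment heuristic are all assumptions made for the sketch; the post only fixes the 20-listing page size and the minimum and maximum sleep times.

```python
from datetime import datetime, time

PAGE_SIZE = 20                    # listings per request (from the post)
MIN_SLEEP = 2 * 60                # minimum sleep: 2 minutes
MAX_SLEEP_DAY = 15 * 60           # 7AM-11PM maximum: 15 minutes
MAX_SLEEP_NIGHT = 2 * 60 * 60     # 11PM-7AM maximum: 2 hours


def collect_new_listings(fetch_page, is_known, max_pages=50):
    """Request pages until a previously archived listing is found.

    fetch_page(n) -> list of dicts with an "id" key (hypothetical shape).
    is_known(listing_id) -> bool, e.g. a MySQL lookup (hypothetical helper).
    Returns (new_listings, pages_requested).
    """
    new, seen, pages = [], set(), 0
    for page in range(1, max_pages + 1):   # max_pages is an assumed safety cap
        listings = fetch_page(page)
        pages += 1
        fresh = [l for l in listings if not is_known(l["id"])]
        for listing in fresh:
            # Listings can shift onto the next page while we are paging,
            # so skip ids already collected in this run.
            if listing["id"] not in seen:
                seen.add(listing["id"])
                new.append(listing)
        # Stop when the page held a known listing or was not a full page.
        if len(fresh) < len(listings) or len(listings) < PAGE_SIZE:
            break
    return new, pages


def max_sleep(now):
    """Upper bound on the sleep time for the current time of day."""
    daytime = time(7, 0) <= now.time() < time(23, 0)
    return MAX_SLEEP_DAY if daytime else MAX_SLEEP_NIGHT


def next_sleep(current, new_count, pages, now):
    """Pick the next sleep time in seconds.

    Assumed heuristic (not from the post): halve the sleep when more than
    one page was needed, grow it by 50% when under half a page was new.
    """
    if pages > 1:
        proposed = current // 2
    elif new_count < PAGE_SIZE // 2:
        proposed = int(current * 1.5)
    else:
        proposed = current
    return max(MIN_SLEEP, min(proposed, max_sleep(now)))
```

In a real loop, the result of `next_sleep` would be passed to `time.sleep` after steps 2 to 6 have run for the listings returned by `collect_new_listings`. The adjustment rule is a placeholder, and any rule works as long as the final clamp to the 2-minute minimum and the time-of-day maximum is kept.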
Project ID: 10186611

About the project

11 proposals
Remote project
Active 8 years ago

11 freelancers are bidding on this project, with an average bid of $386 CAD.
Hi, I have read the description and would like to discuss it. I have good web scraping experience and reviews, and can develop web scraping scripts in Python and C#. Hope we can discuss the details.
$250 CAD in 3 days
5.0 (149 reviews)
6.8
We are a team (19 operators and 2 quality checkers) that has been providing research services worldwide for the last 4 years with high-quality output. I have gone through your project description. It is a really interesting job, and our operators are experienced enough in research that they can easily collect the data from several sources through deep investigation, but it is a somewhat time-consuming job, not a copy-paste one. We would like to talk in detail and outline how we will do this job if you need. Let's talk here to discuss the job. Thanks, Dg
$250 CAD in 10 days
4.7 (221 reviews)
7.1
I have reviewed your bid request and I am very interested in your project. I was trained overseas and have an extensive customer service record, so contact me so we can discuss further or begin. I work in milestones and the "payment for time" option. If payment is by deliverables, then the milestones are 50% payment once the initial work/draft is done, and the remainder can be paid if/when revisions are needed and completed. Bonuses are welcome and much appreciated. I've done many jobs on freelancer.com and hope for many with you. If nothing else, add me to your coder list and notify me of your future jobs. Thanks.
$261 CAD in 7 days
2.5 (1 review)
1.7
I have great expertise in web scraping in PHP. I have built up a personal library that lets me accomplish every request easily. I can handle sessions, proxies and avoid anti-scraping controls.
$250 CAD in 3 days
0.0 (0 reviews)
0.0
I am new to Freelancer, but I have been working with a company and was doing really well. I have made a few apps and done more than 1K data entry projects. My typing speed is almost 95 WPM, and I can assure you I will complete your work and provide the best I can.
$277 CAD in 10 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$555 CAD in 10 days
0.0 (0 reviews)
0.0

About the client

Bangladesh
0.0
0
Member since Apr 11, 2016
