Update Script that Crawls iTunes and Inserts Data into Cassandra

Closed. Posted: 3 years ago. Paid on delivery.

I currently have a script that reads from the iTunes API and writes the data to Elasticsearch and Cassandra. It crawls the podcast RSS feeds twice per day and also checks iTunes for new podcasts every day. Here is an example of an RSS feed that it parses:

[login to view URL]

Podcasts are like audio shows, and each podcast has multiple episodes. In other words, each podcast has one RSS feed, and each RSS feed lists the episodes for that podcast, sorted by newest release date first.
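To illustrate the podcast/episode structure, here is a minimal sketch that reads one feed and returns its episodes newest first. It is not the existing script: `SAMPLE_FEED` and `parse_episodes` are made-up names, and the sample XML only assumes the standard RSS 2.0 `channel`/`item`/`pubDate` layout.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical, heavily trimmed feed; real podcast feeds carry many more tags.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example Show</title>
<item><title>Episode 1</title><pubDate>Tue, 02 Jun 2020 08:00:00 GMT</pubDate></item>
<item><title>Episode 2</title><pubDate>Tue, 09 Jun 2020 08:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_episodes(feed_xml):
    """Return (podcast_title, [(episode_title, published_at), ...]) newest first."""
    channel = ET.fromstring(feed_xml).find("channel")
    episodes = []
    for item in channel.findall("item"):
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # skip items without a release date
        title = item.findtext("title", default="")
        episodes.append((title, parsedate_to_datetime(pub_date)))
    # Do not rely on the feed's own ordering; sort explicitly.
    episodes.sort(key=lambda episode: episode[1], reverse=True)
    return channel.findtext("title"), episodes


podcast_title, episodes = parse_episodes(SAMPLE_FEED)
```

Sorting explicitly is a design choice for the sketch: feeds are expected to be newest first, but a crawler probably should not depend on that.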

The current developer of the script is not very responsive to change requests, so your job is to:

1 - Fix the parse errors that occur for some of the podcast RSS feeds.

2 - Add the many podcasts we are missing from iTunes. We can get some of those from another website's API.

3 - Set up data for each podcast on how often it releases new episodes. We can determine the frequency by looking at the RSS feed, then store it in the database. For example, podcasts that release once per day or multiple times per week should be crawled every hour or every few hours. Those that release once per week should be crawled maybe 2 - 4 times per day, and so on. If a podcast releases only once per month, then we only crawl it once per day.
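The frequency rule in item 3 could be sketched roughly as follows. The function names and the cut-off values (3 days, 10 days) are assumptions chosen for illustration, not agreed requirements; only the resulting crawl intervals follow the ranges described above.

```python
from datetime import datetime, timedelta
from statistics import median


def median_release_gap(pub_dates):
    """Median time between consecutive episodes, or None with fewer than two dates."""
    ordered = sorted(pub_dates)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return None
    return timedelta(seconds=median(gaps))


def crawl_interval(gap):
    """Map a release gap to how often the feed should be crawled."""
    if gap is None:
        return timedelta(days=1)    # unknown frequency: assume once per day
    if gap <= timedelta(days=3):    # assumed cut-off: daily or several per week
        return timedelta(hours=1)
    if gap <= timedelta(days=10):   # assumed cut-off: roughly weekly
        return timedelta(hours=8)   # 3 crawls per day, inside the 2 - 4 range
    return timedelta(days=1)        # monthly or rarer: once per day


# Example: a show that publishes every Monday.
weekly = [datetime(2020, 6, 1) + timedelta(weeks=i) for i in range(4)]
interval = crawl_interval(median_release_gap(weekly))
```

The median is used instead of the mean so that one long break between episodes does not distort the estimate. The stored value could be either the gap or the derived interval.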

I will give you the code so you can understand how it currently works.

The code is written in Python. You can rewrite it in any other language that makes sense; it does not have to stay in Python.

We also need a new system to organize the crawling so that we know its status: whether certain RSS feeds had errors and could not be updated. Then we need to find out the reason. If the reason is acceptable, we can make a note of that in the database.
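One possible shape for that status record is sketched below. Everything here is hypothetical (`CrawlStatus`, `crawl_feed`, the `error_kind` labels, the injected `fetch` callable), and the actual write to Cassandra or Elasticsearch is deliberately left out because the current schema is not shown.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class CrawlStatus:
    feed_url: str
    checked_at: datetime
    ok: bool
    episode_count: int = 0
    error_kind: Optional[str] = None    # e.g. "fetch_error" or "parse_error"
    error_detail: Optional[str] = None
    acceptable: bool = False            # set after someone reviews the reason
    note: Optional[str] = None


def crawl_feed(feed_url: str, fetch: Callable[[str], str]) -> CrawlStatus:
    """Fetch and parse one feed, always returning a status row instead of raising."""
    now = datetime.now(timezone.utc)
    try:
        body = fetch(feed_url)
    except Exception as exc:  # network failure, HTTP error, timeout, ...
        return CrawlStatus(feed_url, now, ok=False,
                           error_kind="fetch_error", error_detail=str(exc))
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        return CrawlStatus(feed_url, now, ok=False,
                           error_kind="parse_error", error_detail=str(exc))
    return CrawlStatus(feed_url, now, ok=True,
                       episode_count=len(root.findall("./channel/item")))


def mark_acceptable(status: CrawlStatus, note: str) -> CrawlStatus:
    """Record that a failure was reviewed and its reason is acceptable."""
    status.acceptable = True
    status.note = note
    return status
```

Passing `fetch` in as a parameter keeps the sketch testable without network access. One row per crawl attempt would then be enough to list every feed that failed to update, and why.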

The updates are added to ElasticSearch and Cassandra.

Thank you

Skills: NoSQL Couch & Mongo, Python, Elasticsearch, Web Scraping, Cassandra

Project ID: #26935223

About the project

7 proposals. Remote project. Active: 3 years ago.

7 freelancers bid an average of $193 for this job.

lucas94work

Hi. If your code is written in Python, I think using Python for the crawling would be good. I would like to discuss more details with you. Kind regards, Lucas

$165 USD (in 7 days)
(8 reviews)
3.2
hemsingh1

Hi Steven S., I am a NoSQL Couch & Mongo, Python, Elasticsearch, Web Scraping, and Cassandra professional with over 5 years of experience. I have worked on several similar projects and can deliver quality solution to tig...

$195 USD (in 3 days)
(1 review)
2.0
lukyaanton

Hi, Manager! I hope you are safe and unaffected by COVID-19. I have read your requirements carefully. I am a scraping expert using Python, and I'm an experienced programmer with over 7 years of experience. Please...

$100 USD (in 2 days)
(2 reviews)
1.3
nizamfarhas

Hello, I checked your post titled "Update Script that Crawls iTunes and Inserts Data into Cassandra". I am familiar with Python and want to discuss your project in detail. Please contact me. Best regards

$185 USD (in 4 days)
(0 reviews)
0.0