Text Analysis Project using PySpark ML

$30-250 USD

Closed
Posted over 5 years ago


Paid on delivery
I want someone to do a theme analysis of around 5 million comments from a video sharing website, using the PySpark ML library as the main tool. I will provide the dataset. The work environment should be Databricks Community Edition (you can create an account for free), and the deliverable is a Databricks notebook. The data is at “video_creator – commentor_id – comment” granularity. What I want you to do is the following:

1. Remove comments that are not written in English.
2. For each commentor_id, append all of his/her comments into one feature called “all_comments”. That is, aggregate the dataset to “commentor_id – all_comments” granularity.
3. Transform the “all_comments” feature using the Word2Vec module of the PySpark ML library (not the MLlib library, as I want to do everything using DataFrames).
4. Cluster the transformed “all_comments” feature using the LDA module of PySpark ML.
5. Generate the most frequent words for each cluster identified in step 4.

I will do the interpretation of the results, so you don’t need to worry about it. Overall, it’s a straightforward task of data cleaning, aggregation, and application of standard PySpark ML modules. I estimate this project will take 2 to 3 hours of programming for someone good at Python and PySpark. I hope to get the project done in 3 days; up to 6 days is acceptable. If you place a bid, I will share the link to the data file with you. I don't have any instructions other than the five steps listed above.
Project ID: 17903811

About the project

7 proposals
Remote project
Active 5 years ago

7 freelancers are bidding an average of $271 USD on this project.
I have good hands-on experience with advanced R and Python, BI tools and technologies, AI, and Big Data. I have a good knowledge of DL/ML algorithms and have also developed dashboards and web applications. My areas of expertise are building financial models (stock markets), image processing, building models for the food, healthcare, and telecom sectors, classification/prediction/clustering, NLP, and chatbots. I understand the project requirements and will deliver the desired product within the time specified. I would like to hear from you. Thanks, Shivam
$250 USD in 3 days
4.7 (66 reviews)
7.0

Hello! I am a Python developer. I looked at your project and it seems interesting. I have all the necessary skills required for this project. Ping me to discuss in detail.
$140 USD in 2 days
4.7 (34 reviews)
5.6

Hi, I am a very experienced statistician, data scientist, and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several companies and have done projects requiring high-level quantitative analysis and data interpretation skills to study trends and time behaviour and to compare the variables in the data. I can do advanced analysis, such as machine learning, hypothesis testing, forecasting, t-tests, and ANOVA, in SPSS, R, Python, Weka, Tableau, and Excel. Looking forward to discussing, Best Regards, Suyash
$500 USD in 3 days
4.0 (31 reviews)
5.9

Kindly let's discuss over chat.
$222 USD in 6 days
4.5 (34 reviews)
4.8

Hello, I have read your job description carefully. I have 7 years of Python experience. I would like to discuss the project with you via chat. Thank you, James.
$155 USD in 3 days
5.0 (3 reviews)
2.4

I have been working as a data scientist for more than 4 years, during which I have implemented numerous machine learning algorithms to solve varied business problems. Moreover, to gain expertise in other domains, I have been actively participating in online data science contests, a few of which I have won.
$388 USD in 7 days
5.0 (2 reviews)
2.2

About the client

Durham, United States
4.4
36
Payment method verified
Member since June 12, 2013

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)