SparkR :: gapply: How to use LinearRegression across groups in a DataFrame?

Closed · Posted: 2 years ago · Paid on delivery

Hi there

I have a large dataset and want to fit a linear model to each group of it. Below is a small example that shows the pattern I want to parallelise.

# Determine six waiting times with the largest eruption time in minutes.
# df is assumed to be a SparkDataFrame with columns "waiting" and "eruptions".
schema <- structType(structField("waiting", "double"), structField("max_eruption", "double"))
result <- gapply(
    df,
    "waiting",
    function(key, x) {
        # key holds the group's "waiting" value; x is that group's rows as a local data.frame
        y <- data.frame(key, max(x$eruptions))
    },
    schema)
head(collect(arrange(result, "max_eruption", decreasing = TRUE)))
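For the per-group linear model described above, one possible shape is sketched below. It is only a sketch under assumptions: the column names group, x and y, the helper names fit_group and fit_all_groups, and the output schema are all hypothetical and not taken from the actual data. The worker is an ordinary R function, so it can be checked locally before being handed to gapply.

```r
# Per-group worker: fits y ~ x on one group's rows and returns one row of
# coefficients. `key` is the list of grouping values that gapply passes in;
# `pdf` is that group's data as a local R data.frame.
# The column names group, x and y are hypothetical.
fit_group <- function(key, pdf) {
  fit <- lm(y ~ x, data = pdf)
  data.frame(
    group     = as.character(key[[1]]),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    n         = as.numeric(nrow(pdf)),
    stringsAsFactors = FALSE
  )
}

# Local check without Spark: apply the same worker to each split of a
# small made-up data.frame.
local_df <- data.frame(
  group = rep(c("a", "b"), each = 4),
  x = c(1, 2, 3, 4, 1, 2, 3, 4),
  y = c(3, 5, 7, 9, 0, -1, -2, -3)
)
local_result <- do.call(
  rbind,
  lapply(split(local_df, local_df$group),
         function(pdf) fit_group(list(pdf$group[1]), pdf))
)

# Spark version: needs an active SparkR session, so it is only defined here,
# not called. `sdf` is assumed to be a SparkDataFrame with columns group, x, y.
# The schema must match the columns and types returned by fit_group.
fit_all_groups <- function(sdf) {
  schema <- SparkR::structType(
    SparkR::structField("group", "string"),
    SparkR::structField("intercept", "double"),
    SparkR::structField("slope", "double"),
    SparkR::structField("n", "double")
  )
  SparkR::gapply(sdf, "group", fit_group, schema)
}
```

With a SparkR session running, something like head(collect(fit_all_groups(createDataFrame(local_df)))) should return one row of coefficients per group. Each group is processed in a single R worker, so this approach presumes that every individual group fits in that worker's memory.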

Skills: Data Mining, R Programming Language

Project ID: #30580205

About the project

4 proposals · Remote project · Active 2 years ago

Average bid from the 4 freelancers on this job: €10 per hour

Annmarie1995

Hi, I am a professional statistician with 5 years of experience. I have read the job description and will help you complete the project. I have skills in Data Mining and R Programming Language. I can deliver quality an...

€16 EUR per hour
(23 reviews)
4.9
WycOj

EXPERT IN STATISTICS. Hello there, I am strongest in statistics, data analysis in R, SPSS, Statistical/Data Analysis, Multivariate Statistical Analysis, Regression Analysis, STATA, MINITAB, R language, Factor Ana...

€10 EUR per hour
(19 reviews)
4.4
ibahimakerkouch

Hi, I have extensive experience with R programming and hold a master's degree in data science. You can check my profile and reviews to see that I have done good work on R projects. Your project is a challenge for me. Le...

€4 EUR per hour
(20 reviews)
4.3
StatisticandArt

Hi, I have a Bachelor's degree in Statistics. I have experience using R, which I learned in college. I am also a specialist in basic statistical analysis (descriptive analysis, graph, chart...

€8 EUR per hour
(10 reviews)
3.2