sansha94/PySpark-Cheatsheet

PySpark Cheatsheet

Cheatsheet for pandas 🐼 lovers.

The examples in this cheatsheet are built around time series data from a wind turbine.

Dataset Information

Context of the dataset

The dataset was collected from one of the wind turbines installed at a wind farm in Turkey. Readings are recorded at a granularity of 10 minutes.
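Since the readings are 10 minutes apart, a complete day should contain 24 × 6 = 144 rows, and any missing interval shows up as a gap in the timestamps. The snippet below is a minimal plain-Python sketch of that check, not part of the original cheatsheet. The helper names (`expected_timestamps`, `missing_timestamps`) and the example date are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

# Assumption: the 10-minute granularity stated in the dataset description.
STEP = timedelta(minutes=10)


def expected_timestamps(start, end):
    """Yield every timestamp from start (inclusive) to end (exclusive) in 10-minute steps."""
    current = start
    while current < end:
        yield current
        current += STEP


def missing_timestamps(observed, start, end):
    """Return the expected 10-minute timestamps that are absent from `observed`."""
    seen = set(observed)
    return [ts for ts in expected_timestamps(start, end) if ts not in seen]


# Hypothetical example: one full day with a single reading dropped.
day_start = datetime(2018, 1, 1)
day_end = day_start + timedelta(days=1)
full_day = list(expected_timestamps(day_start, day_end))
dropped = datetime(2018, 1, 1, 0, 30)
observed = [ts for ts in full_day if ts != dropped]

print(len(full_day))  # 144 readings per day at 10-minute intervals
print(missing_timestamps(observed, day_start, day_end))
```

The same idea carries over to a DataFrame: compare the timestamps that are present against the regular 10-minute grid before relying on lag or window calculations.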

Data Dictionary

| Column | Description |
| --- | --- |
| Date/Time | 10-minute time interval |
| LV ActivePower (kW) | Power generated by the turbine at that time |
| Wind Speed (m/s) | Wind speed at the hub height of the turbine |
| TheoreticalPowerCurve (KWh) | Theoretical power that the turbine generates at that wind speed, as given by the turbine manufacturer |
| Wind Direction (°) | Wind direction at the hub height of the turbine (wind turbines turn to this direction automatically) |

Source: Kaggle

Note for the enthusiasts

If you find a bug 🐞 or typo ⌨️, feel free 🙂 to raise an issue 🎫.