chengkaiyang2025/BigData

Over the past few years of work I have learned a great deal about big data, and I think it is important to record what I have learned.

The code in this repository covers the following big data tools:

  1. Flink: examples at the various layers of the Flink API, including how to develop real-time log parsing programs.
  2. Hadoop MapReduce: example programs, including one that parses offline logs.
  3. HBase: examples of the HBase API, mainly an analysis of how to optimize HBase reads and writes.
  4. Hive SQL: example queries, including partition functions, `LAG` functions, etc.
  5. Spark and Scala: how to use Spark RDD, Spark DataFrame, and Spark SQL for data analysis.
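
As a rough illustration of the log-parsing idea mentioned in items 1 and 2, here is a minimal, dependency-free Java sketch of a map step (extract a field from each line) followed by a reduce step (count per key). It is not code from this repository and it does not use the Hadoop or Flink APIs. The log format (`<date> <time> <LEVEL> <message>`), the class name `LogLevelCount`, and its methods are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch: not taken from this repository.
public class LogLevelCount {

    // "Map" step: extract the log level from a single line.
    // Assumed line format: "<date> <time> <LEVEL> <message...>"
    public static Optional<String> parseLevel(String line) {
        if (line == null) {
            return Optional.empty();
        }
        String[] parts = line.trim().split("\\s+", 4);
        if (parts.length < 3) {
            return Optional.empty(); // malformed line: skip it
        }
        return Optional.of(parts[2]);
    }

    // "Reduce" step: count how many lines were seen for each level.
    public static Map<String, Integer> countLevels(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            parseLevel(line).ifPresent(level -> counts.merge(level, 1, Integer::sum));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "2024-01-01 12:00:00 INFO service started",
            "2024-01-01 12:00:05 ERROR disk full",
            "bad line",
            "2024-01-01 12:00:09 INFO request handled");
        System.out.println(countLevels(lines));
    }
}
```

In a real MapReduce or Flink job the same two steps would be spread across a cluster by the framework; the sketch only shows the per-record logic.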

For instructions on running each example, please refer to the Readme.md under the documentation.

Some code comments are currently in Chinese; I will translate them into English later.