启动 spark shell 后,我们可以导入库,并创建 Hudi 表,如下所示 。
Welcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.4.8 /_/Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_312)Type in expressions to have them evaluated.Type :help for more information.scala> spark.sql("""create table hudi_product_catalog ( | seller_id int, | prod_category string, | product_name string, | product_package string, | discount_percentage string, | eff_start_ts timestamp, | eff_end_ts timestamp, | actv_ind int | ) using hudi | tblproperties ( | type = 'cow', | primaryKey = 'seller_id,prod_category,eff_end_ts', | preCombineField = 'eff_start_ts' | ) | partitioned by (actv_ind) | location 'gs://target_bucket/hudi_product_catalog/'""")将数据写入到存储桶后,如下是 Hudi 目标表的数据格式 。
+-------------------+---------------------+-------------------------------------------------------------------------+----------------------+--------------------------------------------------------------------------+---------+--------------+---------------+---------------+-------------------+-------------------+-------------------+--------+|_hoodie_commit_time|_hoodie_commit_seqno |_hoodie_record_key |_hoodie_partition_path|_hoodie_file_name |seller_id|prod_category |product_name |product_package|discount_percentage|eff_start_ts |eff_end_ts |actv_ind|+-------------------+---------------------+-------------------------------------------------------------------------+----------------------+--------------------------------------------------------------------------+---------+--------------+---------------+---------------+-------------------+-------------------+-------------------+--------+|20220722113258101 |20220722113258101_0_0|seller_id:3412,prod_category:Healthcare,eff_end_ts:253402300799000000 |actv_ind=1 |a94c9c58-ac6b-4841-a734-8ef1580e2547-0_0-29-1219_20220722113258101.parquet|3412 |Healthcare |Dolo 650 |10 |10 |2022-04-01 16:30:45|9999-12-31 23:59:59|1 ||20220722113258101 |20220722113258101_0_1|seller_id:1234,prod_category:Home Essential,eff_end_ts:253402300799000000|actv_ind=1 |a94c9c58-ac6b-4841-a734-8ef1580e2547-0_0-29-1219_20220722113258101.parquet|1234 |Home Essential|Hand Towel |12 |20 |2021-10-20 06:55:22|9999-12-31 23:59:59|1 ||20220722113258101 |20220722113258101_0_2|seller_id:4565,prod_category:Gourmet,eff_end_ts:253402300799000000 |actv_ind=1 |a94c9c58-ac6b-4841-a734-8ef1580e2547-0_0-29-1219_20220722113258101.parquet|4565 |Gourmet |Dairy Milk Silk|6 |30 |2021-06-12 20:30:40|9999-12-31 23:59:59|1 ||20220722113258101 |20220722113258101_0_3|seller_id:1234,prod_category:Detergent,eff_end_ts:253402300799000000 |actv_ind=1 |a94c9c58-ac6b-4841-a734-8ef1580e2547-0_0-29-1219_20220722113258101.parquet|1234 |Detergent |Tide 2L |6 |15 |2021-12-15 15:20:30|9999-12-31 23:59:59|1 |+-------------------+---------------------+-------------------------------------------------------------------------+----------------------+--------------------------------------------------------------------------+---------+--------------+---------------+---------------+-------------------+-------------------+-------------------+--------+
经验总结扩展阅读
- iptables使用详解
- 华为车载智慧屏值得买吗_华为车载智慧屏使用评测
- 夏天使用空调怎么更省电 空调不制冷解决方法
- Pytest进阶使用
- 法国大宝能当晚霜使用吗?
- 如何使用 pyqt 读取串口传输的图像
- 你们觉得华为手机卡不卡,使用体验如何(华为加装nm卡缺点)
- 飞机上手机可以开机吗 飞机上手机可以开机正常使用吗
- 古墓丽影10怎么打飞机(古墓丽影10怎么使用榴弹)
- 水乳霜眼霜的使用顺序是怎么样的?