diff --git a/docs/tutorial/Advanced-Tutorial-Tune-your-Application.zh.md b/docs/tutorial/Advanced-Tutorial-Tune-your-Application.zh.md
new file mode 100644
index 00000000000..e22c5b1794c
--- /dev/null
+++ b/docs/tutorial/Advanced-Tutorial-Tune-your-Application.zh.md
@@ -0,0 +1,86 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
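+# Advanced tutorial: Tune your Sedona RDD application
+
+Before starting this advanced tutorial, make sure that you have tried several Sedona functions on your local machine.
+
+## Pick a proper Sedona version
+
+A Sedona version number has three levels (for example, 0.8.1).
+
+The first level indicates a major structural redesign, which may bring significant API changes and performance differences.
+
+The second level (for example, 0.8) indicates that the release contains significant performance improvements, important new features, and API changes. If you are an existing Sedona user and want to upgrade to such a release, treat the API changes with care. Before upgrading, read the [Sedona release notes](../setup/release-notes.md) and confirm that you can accept the API changes.
+
+The third level (for example, 0.8.1) contains only bug fixes, a few small new features, and minor performance improvements. It contains no API changes, so upgrading to such a release is safe. We strongly recommend that all Sedona users on the same second-level version upgrade to its latest third-level release.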
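+
+The upgrade rule above can be sketched in a few lines of plain Python. This is only an illustration of the three-level scheme described here; the helper names `parse_version` and `is_api_safe_upgrade` are hypothetical and are not part of Sedona.
+
+```python
+def parse_version(version):
+    """Split a three-level version string such as '0.8.1' into integers."""
+    first, second, third = (int(part) for part in version.split("."))
+    return first, second, third
+
+
+def is_api_safe_upgrade(current, target):
+    """Return True when only the third level increases (no API changes expected)."""
+    cur, new = parse_version(current), parse_version(target)
+    return cur[:2] == new[:2] and new[2] >= cur[2]
+
+
+print(is_api_safe_upgrade("0.8.1", "0.8.2"))  # same second level
+print(is_api_safe_upgrade("0.8.1", "0.9.0"))  # second level changes
+```
+
+An upgrade that changes the first or second level should be checked against the release notes before it is rolled out.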
+
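+## Choose a proper Spatial RDD constructor
+
+Sedona provides several constructors for each SpatialRDD (PointRDD, PolygonRDD, and LineStringRDD). In general, there are two entry points:
+
+1. Initialize a SpatialRDD from a data source such as HDFS or S3. A typical example is: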
+
+```java
+public PointRDD(JavaSparkContext sparkContext, String InputLocation, Integer Offset, FileDataSplitter splitter, boolean carryInputData, Integer partitions, StorageLevel newLevel)
+```
+
+2. Initialize a SpatialRDD from an existing RDD. A typical example is:
+
+```java
+public PointRDD(JavaRDD<Point> rawSpatialRDD, StorageLevel newLevel)
+```
+
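+Note that these constructors all take a `StorageLevel` parameter. It tells Spark to cache the `rawSpatialRDD` attribute of the SpatialRDD. Sedona does this because it runs several Spark "Actions" to compute the dataset boundary and the approximate total count. This information is useful when running a Spatial Join Query or a Distance Join Query.
+
+However, if you know your dataset well, you can provide this information yourself by calling a Spatial RDD constructor of the following form: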
+
+```java
+public PointRDD(JavaSparkContext sparkContext, String InputLocation, Integer Offset, FileDataSplitter splitter, boolean carryInputData, Integer partitions, Envelope datasetBoundary, Integer approximateTotalCount)
+```
+
+Providing the dataset boundary and the approximate total count manually lets Sedona skip several slow "Actions" during initialization.
+
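+## Cache the Spatial RDD that is repeatedly used
+
+Each SpatialRDD (PointRDD, PolygonRDD, and LineStringRDD) has 4 RDD attributes:
+
+1. rawSpatialRDD: the RDD generated by the SpatialRDD constructor.
+2. spatialPartitionedRDD: the RDD obtained by spatially partitioning rawSpatialRDD. Note: this RDD contains duplicated spatial objects.
+3. indexedRawRDD: the RDD obtained by building an index on rawSpatialRDD.
+4. indexedRDD: the RDD obtained by building an index on spatialPartitionedRDD. Note: this RDD contains duplicated spatial objects.
+
+These 4 RDDs do not exist at the same time, so you don't need to worry about memory.
+Each of them is used by a different kind of query:
+
+1. Spatial Range Query / KNN Query, index not enabled: rawSpatialRDD is used.
+2. Spatial Range Query / KNN Query, index enabled: indexedRawRDD is used.
+3. Spatial Join Query / Distance Join Query, index not enabled: spatialPartitionedRDD is used.
+4. Spatial Join Query / Distance Join Query, index enabled: indexedRDD is used.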
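+
+The mapping in the list above can be written as a small lookup table. The sketch below is plain Python for illustration only: the keys `"range_or_knn"` and `"join"` and the helper `rdd_to_cache` are hypothetical names, while the values are the four attribute names listed above.
+
+```python
+# (query kind, index enabled?) -> SpatialRDD attribute used by that query
+RDD_FOR_QUERY = {
+    ("range_or_knn", False): "rawSpatialRDD",
+    ("range_or_knn", True): "indexedRawRDD",
+    ("join", False): "spatialPartitionedRDD",
+    ("join", True): "indexedRDD",
+}
+
+
+def rdd_to_cache(query_kind, use_index):
+    """Return the attribute worth caching when the given query runs repeatedly."""
+    return RDD_FOR_QUERY[(query_kind, use_index)]
+
+
+print(rdd_to_cache("join", True))
+```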
+
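+Therefore, if you run one of the queries above many times, it is best to cache the corresponding RDD in memory. Common scenarios include:
+
+1. In spatial data mining tasks such as Spatial Autocorrelation and Spatial Co-location Pattern Mining, you may need to run Spatial Join / Spatial Self-join iteratively to compute an adjacency matrix. In this case, cache the spatialPartitionedRDD/indexedRDD that is queried repeatedly.
+2. In Spark RDD sharing applications such as [Livy](https://github.com/cloudera/livy) and [Spark Job Server](https://github.com/spark-jobserver/spark-jobserver), many users may run Spatial Range Query / KNN Query with different predicates on the same Spatial RDD. In this case, we suggest caching the rawSpatialRDD/indexedRawRDD.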
+
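+## Be aware of the Spatial RDD partitions
+
+Users sometimes report slow execution in certain cases. As a first step, always consider increasing the number of SpatialRDD partitions (2 - 8 times the original value is suggested). You can set this when initializing the SpatialRDD, and it often improves performance significantly.
+
+After that, you can consider tuning other Spark parameters, for example using the Kryo serializer or adjusting the fraction of memory used for cached RDDs.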
diff --git a/docs/tutorial/benchmark.zh.md b/docs/tutorial/benchmark.zh.md
new file mode 100644
index 00000000000..eab59e825eb
--- /dev/null
+++ b/docs/tutorial/benchmark.zh.md
@@ -0,0 +1,25 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
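+## Benchmark
+
+We welcome people to use Sedona for benchmarking. To get the best performance or to use all of Sedona's features, please:
+
+* Always use the latest version, or state the version used in your benchmark so that we can trace back any issues.
+* Enable the Sedona Kryo serializer to reduce the memory footprint.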
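+
+As a rough sketch of the second point, the two Spark settings usually involved can be kept in a small Python dictionary and turned into `spark-submit` flags. The registrator class name below is an assumption based on the Sedona setup documentation and should be checked against the Sedona version you benchmark; the helper `as_submit_flags` is a hypothetical name.
+
+```python
+# Assumed settings for enabling Kryo with Sedona's registrator; verify for your version.
+KRYO_SETTINGS = {
+    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
+    "spark.kryo.registrator": "org.apache.sedona.core.serde.SedonaKryoRegistrator",
+}
+
+
+def as_submit_flags(settings):
+    """Render the settings as a flat list of spark-submit '--conf key=value' flags."""
+    flags = []
+    for key, value in settings.items():
+        flags += ["--conf", f"{key}={value}"]
+    return flags
+
+
+print(" ".join(as_submit_flags(KRYO_SETTINGS)))
+```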
diff --git a/docs/tutorial/concepts/clustering-algorithms.zh.md b/docs/tutorial/concepts/clustering-algorithms.zh.md
new file mode 100644
index 00000000000..06bdb2d9187
--- /dev/null
+++ b/docs/tutorial/concepts/clustering-algorithms.zh.md
@@ -0,0 +1,136 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
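+# Clustering with Apache Sedona on Apache Spark
+
+Clustering algorithms group similar data points into "clusters". Apache Sedona can run clustering algorithms on large geometric datasets.
+
+Note that the term "cluster" has two meanings here:
+
+* A computation cluster is a network of computers that work together to execute the algorithm
+* A clustering algorithm divides data points into different "clusters"
+
+On this page, "cluster" refers to the output of a clustering algorithm.
+
+## Clustering with DBSCAN on Spark
+
+This page explains how to use Apache Sedona to perform density-based spatial clustering of applications with noise (DBSCAN).
+
+DBSCAN groups geometries in high-density regions into clusters and marks points in low-density regions as noise/outliers.
+
+Let's look at a scatter plot of data that can be clustered:
+
+![scatter plot of points](../../image/tutorial/concepts/dbscan-scatterplot-points.png)
+
+DBSCAN clusters the data as follows:
+
+![scatter plot with cluster groupings](../../image/tutorial/concepts/dbscan-clustering.png)
+
+* Cluster 0 contains 5 points
+* Cluster 1 contains 4 points
+* 4 points are outliers
+
+Let's create a Spark DataFrame with this data and run the clustering with Sedona. The code that builds the DataFrame is: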
+
+```python
+df = (
+    sedona.createDataFrame(
+        [
+            (1, 8.0, 2.0),
+            (2, 2.6, 4.0),
+            (3, 2.5, 4.0),
+            (4, 8.5, 2.5),
+            (5, 2.8, 4.3),
+            (6, 12.8, 4.5),
+            (7, 2.5, 4.2),
+            (8, 8.2, 2.5),
+            (9, 8.0, 3.0),
+            (10, 1.0, 5.0),
+            (11, 8.0, 2.5),
+            (12, 5.0, 6.0),
+            (13, 4.0, 3.0),
+        ],
+        ["id", "x", "y"],
+    )
+).withColumn("point", ST_Point(col("x"), col("y")))
+```
+
+The DataFrame contains:
+
+```
++---+----+---+----------------+
+| id|   x|  y|           point|
++---+----+---+----------------+
+|  1| 8.0|2.0|     POINT (8 2)|
+|  2| 2.6|4.0|   POINT (2.6 4)|
+|  3| 2.5|4.0|   POINT (2.5 4)|
+|  4| 8.5|2.5| POINT (8.5 2.5)|
+|  5| 2.8|4.3| POINT (2.8 4.3)|
+|  6|12.8|4.5|POINT (12.8 4.5)|
+|  7| 2.5|4.2| POINT (2.5 4.2)|
+|  8| 8.2|2.5| POINT (8.2 2.5)|
+|  9| 8.0|3.0|     POINT (8 3)|
+| 10| 1.0|5.0|     POINT (1 5)|
+| 11| 8.0|2.5|   POINT (8 2.5)|
+| 12| 5.0|6.0|     POINT (5 6)|
+| 13| 4.0|3.0|     POINT (4 3)|
++---+----+---+----------------+
+```
+
+Run the DBSCAN algorithm as follows:
+
+```python
+from sedona.spark.stats import dbscan
+
+dbscan(df, 1.0, 3).orderBy("id").show()
+```
+
+The result is:
+
+```
++---+----+---+----------------+------+-------+
+| id|   x|  y|           point|isCore|cluster|
++---+----+---+----------------+------+-------+
+|  1| 8.0|2.0|     POINT (8 2)|  true|      0|
+|  2| 2.6|4.0|   POINT (2.6 4)|  true|      1|
+|  3| 2.5|4.0|   POINT (2.5 4)|  true|      1|
+|  4| 8.5|2.5| POINT (8.5 2.5)|  true|      0|
+|  5| 2.8|4.3| POINT (2.8 4.3)|  true|      1|
+|  6|12.8|4.5|POINT (12.8 4.5)| false|     -1|
+|  7| 2.5|4.2| POINT (2.5 4.2)|  true|      1|
+|  8| 8.2|2.5| POINT (8.2 2.5)|  true|      0|
+|  9| 8.0|3.0|     POINT (8 3)|  true|      0|
+| 10| 1.0|5.0|     POINT (1 5)| false|     -1|
+| 11| 8.0|2.5|   POINT (8 2.5)|  true|      0|
+| 12| 5.0|6.0|     POINT (5 6)| false|     -1|
+| 13| 4.0|3.0|     POINT (4 3)| false|     -1|
++---+----+---+----------------+------+-------+
+```
+
+The `cluster` column shows which cluster each geometry belongs to.
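+
+To make the grouping easier to follow, here is a minimal single-machine sketch of DBSCAN in plain Python, run on the same 13 points with a radius of 1.0 and a minimum of 3 points. It is not Sedona's distributed implementation. It assumes that a point counts as its own neighbour and that the distance test is inclusive (`<=`); the function name `dbscan_sketch` is hypothetical.
+
+```python
+import math
+
+POINTS = {
+    1: (8.0, 2.0), 2: (2.6, 4.0), 3: (2.5, 4.0), 4: (8.5, 2.5), 5: (2.8, 4.3),
+    6: (12.8, 4.5), 7: (2.5, 4.2), 8: (8.2, 2.5), 9: (8.0, 3.0), 10: (1.0, 5.0),
+    11: (8.0, 2.5), 12: (5.0, 6.0), 13: (4.0, 3.0),
+}
+
+
+def dbscan_sketch(points, epsilon, min_pts):
+    """Label every point id with a cluster number, or -1 for noise."""
+
+    def neighbours(pid):
+        # Assumption: a point is its own neighbour and the radius test is inclusive.
+        return [q for q in points if math.dist(points[pid], points[q]) <= epsilon]
+
+    labels = {}
+    next_cluster = 0
+    for pid in points:
+        if pid in labels:
+            continue
+        seeds = neighbours(pid)
+        if len(seeds) < min_pts:
+            labels[pid] = -1  # provisional noise; may become a border point later
+            continue
+        labels[pid] = next_cluster
+        queue = [q for q in seeds if q != pid]
+        while queue:
+            q = queue.pop()
+            if labels.get(q) == -1:
+                labels[q] = next_cluster  # noise reachable from a core point is a border point
+            if q in labels:
+                continue
+            labels[q] = next_cluster
+            q_neighbours = neighbours(q)
+            if len(q_neighbours) >= min_pts:
+                queue.extend(q_neighbours)  # q is a core point, so keep expanding
+        next_cluster += 1
+    return labels
+
+
+labels = dbscan_sketch(POINTS, epsilon=1.0, min_pts=3)
+for pid in sorted(labels):
+    print(pid, labels[pid])
+```
+
+Under these assumptions the sketch puts ids 1, 4, 8, 9, and 11 in one cluster, ids 2, 3, 5, and 7 in another, and labels ids 6, 10, 12, and 13 as noise, which matches the table above.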
+
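+To run this operation, you must first set the Spark checkpoint directory. The checkpoint directory is a persistent temporary cache location where intermediate query results are written.
+
+Set the checkpoint directory as follows: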
+
+```python
+sedona.sparkContext.setCheckpointDir(myPath)
+```
+
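+`myPath` must be accessible to all executors. A local path is fine when running on a single machine. If HDFS is available, it is usually a better choice. Some runtime environments may allow or require object storage paths (for example, Amazon S3 or Google Cloud Storage). Some environments already set the Spark checkpoint directory for you, in which case this step can be skipped.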
diff --git a/docs/tutorial/concepts/distance-spark.zh.md b/docs/tutorial/concepts/distance-spark.zh.md
new file mode 100644
index 00000000000..a2408d99ef7
--- /dev/null
+++ b/docs/tutorial/concepts/distance-spark.zh.md
@@ -0,0 +1,275 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
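+# Computing distances with Sedona on Apache Spark
+
+This article explains how to compute the distance between two points or geometries with Apache Sedona and Apache Spark.
+
+You will learn how to compute distances on a two-dimensional Cartesian plane and how to compute distances for geospatial data that account for the curvature of the Earth.
+
+Let's start with an example that computes the distance between two points on a two-dimensional Cartesian plane.
+
+## Distance between two points with Spark and Sedona
+
+Suppose you have four points and want to compute the distance between `point_a` and `point_b`, and between `point_c` and `point_d`.
+
+![distance between two points](../../image/tutorial/concepts/distance1.png)
+
+First, create a DataFrame containing these points: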
+
+```python
+df = sedona.createDataFrame(
+    [
+        (Point(2, 3), Point(6, 4)),
+        (Point(6, 2), Point(9, 2)),
+    ],
+    ["start", "end"],
+)
+```
+
+The `start` and `end` columns both have the `geometry` type.
+
+Use the `ST_Distance` function to compute the distance between the points:
+
+```python
+df.withColumn("distance", ST_Distance(col("start"), col("end"))).show()
+```
+
+The result is:
+
+```
++-----------+-----------+-----------------+
+|      start|        end|         distance|
++-----------+-----------+-----------------+
+|POINT (2 3)|POINT (6 4)|4.123105625617661|
+|POINT (6 2)|POINT (9 2)|              3.0|
++-----------+-----------+-----------------+
+```
+
+With `ST_Distance`, computing the distance between two points on a two-dimensional plane is straightforward.
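+
+For two points, this planar distance is the ordinary Euclidean distance, so the values in the table can be cross-checked with the Python standard library. The sketch below does not use Sedona; it only repeats the arithmetic for the same two pairs of coordinates.
+
+```python
+import math
+
+# (start, end) coordinate pairs taken from the DataFrame above
+pairs = [((2, 3), (6, 4)), ((6, 2), (9, 2))]
+
+# Euclidean distance: sqrt((x2 - x1)^2 + (y2 - y1)^2)
+distances = [math.dist(start, end) for start, end in pairs]
+print(distances)
+```
+
+The first value is the square root of 17 (about 4.1231) and the second is 3.0.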
+
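+## Distance between two longitude/latitude points with Spark and Sedona
+
+Let's create longitude/latitude points and compute the distance between them. First, build a DataFrame from the coordinates: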
+
+```python
+seattle = Point(-122.335167, 47.608013)
+new_york = Point(-73.935242, 40.730610)
+sydney = Point(151.2, -33.9)
+df = sedona.createDataFrame(
+    [
+        (seattle, new_york),
+        (seattle, sydney),
+    ],
+    ["place1", "place2"],
+)
+```
+
+Compute the distance between these points:
+
+```python
+df.withColumn(
+    "st_distance_sphere", ST_DistanceSphere(col("place1"), col("place2"))
+).show()
+```
+
+The result is:
+
+```
++--------------------+--------------------+--------------------+
+|              place1|              place2|  st_distance_sphere|
++--------------------+--------------------+--------------------+
+|POINT (-122.33516...|POINT (-73.935242...|  3870075.7867602874|
+|POINT (-122.33516...| POINT (151.2 -33.9)|1.2473172370818963E7|
++--------------------+--------------------+--------------------+
+```
+
+We use `ST_DistanceSphere` to compute the distance. It takes the curvature of the Earth into account and returns the distance in meters.
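+
+A spherical distance of this kind is commonly computed with the haversine formula. The plain-Python sketch below illustrates that formula for the same coordinates. It is not Sedona's code, and the Earth radius constant is an assumed mean radius, so the last digits may differ from the table above.
+
+```python
+import math
+
+EARTH_RADIUS_M = 6_371_008.0  # assumed mean Earth radius in meters
+
+
+def haversine_m(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
+    """Great-circle distance in meters between two lon/lat points given in degrees."""
+    phi1, phi2 = math.radians(lat1), math.radians(lat2)
+    d_phi = phi2 - phi1
+    d_lambda = math.radians(lon2 - lon1)
+    a = (
+        math.sin(d_phi / 2) ** 2
+        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
+    )
+    return 2 * radius * math.asin(math.sqrt(a))
+
+
+seattle = (-122.335167, 47.608013)
+new_york = (-73.935242, 40.730610)
+sydney = (151.2, -33.9)
+
+print(haversine_m(*seattle, *new_york))
+print(haversine_m(*seattle, *sydney))
+```
+
+Both values should land close to the `st_distance_sphere` column, roughly 3,870 km and 12,470 km.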
+
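+Let's now see how to compute the distance between two points with a spheroid model.
+
+## Distance between two points on a spheroid with Spark and Sedona
+
+Reuse the DataFrame from the previous section, but use the spheroid model. The `select` below assumes the DataFrame also carries `place1_name` and `place2_name` label columns, which the earlier snippet does not create: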
+
+```python
+res = df.withColumn(
+    "st_distance_spheroid", ST_DistanceSpheroid(col("place1"), col("place2"))
+)
+res.select("place1_name", "place2_name", "st_distance_spheroid").show()
+```
+
+The result is:
+
+```
++-----------+-----------+--------------------+
+|place1_name|place2_name|st_distance_spheroid|
++-----------+-----------+--------------------+
+|    seattle|   new_york|  3880173.4858397646|
+|    seattle|     sydney|1.2456531875384018E7|
++-----------+-----------+--------------------+
+```
+
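+`ST_DistanceSpheroid` returns the distance between the two places in meters. The results on the spheroid model are close to those obtained when the Earth is modeled as a sphere, but the spheroid results are slightly more accurate.
+
+## Distance between two geometries with Spark and Sedona
+
+Let's now see how to compute the distance between two geometries, such as a linestring and a polygon. Suppose you have the following objects:
+
+![distance between objects](../../image/tutorial/concepts/distance2.png)
+
+The distance between two geometries is the minimum Euclidean distance between any pair of points, one taken from each geometry.
+
+Compute the distance: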
+
+```python
+res = df.withColumn("distance", ST_Distance(col("geom1"), col("geom2")))
+```
+
+The result is:
+
+```
++---+---+--------+
+|id1|id2|distance|
++---+---+--------+
+|a  |b  |2.0     |
++---+---+--------+
+```
+
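+You can see the minimum distance between the two geometries directly in the figure.
+
+## Three-dimensional minimum Cartesian distance
+
+Let's now see how to take elevation into account when computing the distance between two points.
+
+We will compare a person standing at the top of the Empire State Building with a person standing at ground level, which is modeled here as an elevation of 0.
+
+Build the DataFrame: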
+
+```python
+empire_state_ground = Point(-73.9857, 40.7484, 0)
+empire_state_top = Point(-73.9857, 40.7484, 380)
+df = sedona.createDataFrame(
+    [
+        (empire_state_ground, empire_state_top),
+    ],
+    ["point_a", "point_b"],
+)
+```
+
+Compute the 2D and 3D distances:
+
+```python
+res = df.withColumn("distance", ST_Distance(col("point_a"), col("point_b"))).withColumn(
+    "3d_distance", ST_3DDistance(col("point_a"), col("point_b"))
+)
+```
+
+The result is:
+
+```
++--------------------+--------------------+--------+-----------+
+|             point_a|             point_b|distance|3d_distance|
++--------------------+--------------------+--------+-----------+
+|POINT (-73.9857 4...|POINT (-73.9857 4...|     0.0|      380.0|
++--------------------+--------------------+--------+-----------+
+```
+
+`ST_Distance` does not take elevation into account, whereas `ST_3DDistance` includes it in the computation.
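+
+The difference between the two results can be reproduced with plain Cartesian arithmetic. The sketch below uses only the Python standard library and the same coordinates. It treats all three coordinates as plain numbers, which is an assumption made for illustration.
+
+```python
+import math
+
+ground = (-73.9857, 40.7484, 0.0)
+top = (-73.9857, 40.7484, 380.0)
+
+distance_2d = math.dist(ground[:2], top[:2])  # ignores the third coordinate
+distance_3d = math.dist(ground, top)          # uses all three coordinates
+
+print(distance_2d, distance_3d)
+```
+
+The 3D value equals the height difference only because the two points share the same x and y. When x and y are in degrees and z is in meters, a Cartesian 3D distance between points at different locations mixes units and should be interpreted with care.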
+
+## Frechet distance with Spark and Sedona
+
+Build a Sedona DataFrame containing the following linestrings:
+
+![Frechet distance](../../image/tutorial/concepts/distance3.png)
+
+```python
+a = LineString([(1, 1), (1, 3), (2, 4)])
+b = LineString([(1.1, 1), (1.1, 3), (3, 4)])
+c = LineString([(7, 1), (7, 3), (6, 4)])
+df = sedona.createDataFrame(
+    [
+        (a, "a", b, "b"),
+        (a, "a", c, "c"),
+    ],
+    ["geometry1", "geometry1_id", "geometry2", "geometry2_id"],
+)
+```
+
+Compute the Frechet distance:
+
+```python
+res = df.withColumn(
+    "frechet_distance", ST_FrechetDistance(col("geometry1"), col("geometry2"))
+)
+```
+
+View the result:
+
+```python
+res.select("geometry1_id", "geometry2_id", "frechet_distance").show()
+```
+
+```
++------------+------------+----------------+
+|geometry1_id|geometry2_id|frechet_distance|
++------------+------------+----------------+
+|           a|           b|             1.0|
+|           a|           c|             6.0|
++------------+------------+----------------+
+```
+
+The following figure visualizes these distances, which helps in understanding the algorithm:
+
+![Frechet distance](../../image/tutorial/concepts/distance4.png)
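+
+The two values in the table can be reproduced with the textbook dynamic-programming recurrence for the discrete Frechet distance, which considers only the vertices of each linestring. The plain-Python sketch below assumes that this discrete variant is what is being computed. It is an illustration, not Sedona's implementation, and `discrete_frechet` is a hypothetical name.
+
+```python
+import math
+from functools import lru_cache
+
+
+def discrete_frechet(p, q):
+    """Discrete Frechet distance between two vertex sequences."""
+
+    @lru_cache(maxsize=None)
+    def coupling(i, j):
+        # Smallest achievable "leash length" for the prefixes p[:i+1] and q[:j+1].
+        d = math.dist(p[i], q[j])
+        if i == 0 and j == 0:
+            return d
+        if i == 0:
+            return max(coupling(0, j - 1), d)
+        if j == 0:
+            return max(coupling(i - 1, 0), d)
+        best_previous = min(
+            coupling(i - 1, j), coupling(i - 1, j - 1), coupling(i, j - 1)
+        )
+        return max(best_previous, d)
+
+    return coupling(len(p) - 1, len(q) - 1)
+
+
+a = [(1, 1), (1, 3), (2, 4)]
+b = [(1.1, 1), (1.1, 3), (3, 4)]
+c = [(7, 1), (7, 3), (6, 4)]
+
+print(discrete_frechet(a, b))
+print(discrete_frechet(a, c))
+```
+
+For `a` and `b`, the result is driven by the last pair of vertices, (2, 4) and (3, 4), which are 1.0 apart. For `a` and `c`, it is driven by the first pair, (1, 1) and (7, 1), which are 6.0 apart.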
+
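+## Maximum distance between geometries with Spark and Sedona
+
+Suppose you have the following geometries:
+
+![distance between geometries](../../image/tutorial/concepts/distance5.png)
+
+Compute the maximum distance between some of these geometries: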
+
+```python
+res = df.withColumn("max_distance", ST_MaxDistance(col("geom1"), col("geom2")))
+```
+
+View the result:
+
+```python
+res.select("id1", "id2", "max_distance").show(truncate=False)
+```
+
+```
++---+---+-----------------+
+|id1|id2|max_distance     |
++---+---+-----------------+
+|a  |b  |8.246211251235321|
+|a  |c  |7.615773105863909|
++---+---+-----------------+
+```
+
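+This makes it easy to obtain the maximum distance between two geometries.
+
+## Conclusion
+
+Sedona supports many kinds of distance computations, including distances based on different models of the Earth and more complex computations that account for elevation.
+
+Choose the distance function that best fits your analysis.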
diff --git a/docs/tutorial/demo.zh.md b/docs/tutorial/demo.zh.md
new file mode 100644
index 00000000000..8f6824667fd
--- /dev/null
+++ b/docs/tutorial/demo.zh.md
@@ -0,0 +1,62 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
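+# Scala and Java examples
+
+The [Scala and Java examples](https://github.com/apache/sedona/tree/master/examples) contain template projects for Sedona Spark (RDD, SQL, and Viz) and Sedona Flink. The template projects are already configured properly.
+
+Note: although the template projects are written in Scala, the same APIs can be used in Java as well.
+
+## Folder structure
+
+The folder structure of the repository is as follows:
+
+* spark-sql: a Scala template that shows how to use the Sedona RDD, DataFrame, and SQL APIs
+* flink-sql: a Java template that shows how to use Sedona SQL through the Flink Table API
+
+## Compile and package
+
+### Prerequisites
+
+Make sure the following software is installed on your local machine:
+
+* For Scala projects: Scala 2.12
+* For Java projects: JDK 1.8, Apache Maven 3
+
+### Compile
+
+Run `mvn clean package` in the folder of each template project.
+
+### Submit the fat jar to Spark
+
+After running the command above, you can find the generated fat jar in the `./target` folder. Submit it with `./bin/spark-submit`.
+
+Submitting the jar this way requires the following changes:
+
+* Change the Spark master address in the template project, or simply delete that setting. The templates currently hard-code it to `local[*]`, which means running locally with all cores.
+* Change the packaging scope of the Apache Spark dependency from `compile` to `provided`. This is a common packaging strategy in Maven and SBT: it means Spark itself is not packaged into the fat jar. Otherwise, the jar may become very large and cause version conflicts.
+* Make sure the dependency versions in the build file (`build.sbt`, or `pom.xml` if the template is built with Maven as above) match your Spark version.
+
+## Run the template projects locally
+
+We highly suggest using an IDE to run the template projects on your local machine. For Scala, we recommend IntelliJ IDEA with the Scala plug-in. For Java, we recommend IntelliJ IDEA or Eclipse. With an IDE, **you don't need to prepare anything else** (you don't even need to download and set up Spark). As long as Scala and Java are installed, everything works.
+
+### Scala
+
+Import the Scala template project as an SBT project, then run the Main file in it.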
diff --git a/docs/tutorial/files/csv-geometry-sedona-spark.zh.md b/docs/tutorial/files/csv-geometry-sedona-spark.zh.md
new file mode 100644
index 00000000000..e3d8f931cd5
--- /dev/null
+++ b/docs/tutorial/files/csv-geometry-sedona-spark.zh.md
@@ -0,0 +1,192 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
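+# Reading and writing CSV files with geometries using Apache Sedona and Spark
+
+This article shows how to read and write CSV files with geometry columns using Sedona and Spark.
+
+You will learn about the advantages and disadvantages of CSV for storing geometric data.
+
+Let's first see how to write a CSV file with geometric data.
+
+## Write CSV with geometries using Sedona and Spark
+
+First, create a DataFrame with Sedona and Spark: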
+
+```python
+df = sedona.createDataFrame(
+    [
+        ("a", "LINESTRING(2.0 5.0,6.0 1.0)"),
+        ("b", "POINT(1.0 2.0)"),
+        ("c", "POLYGON((7.0 1.0,7.0 3.0,9.0 3.0,7.0 1.0))"),
+    ],
+    ["id", "geometry"],
+)
+df = df.withColumn("geometry", ST_GeomFromText(col("geometry")))
+```
+
+The DataFrame contains:
+
+```
++---+------------------------------+
+|id |geometry                      |
++---+------------------------------+
+|a  |LINESTRING (2 5, 6 1)         |
+|b  |POINT (1 2)                   |
+|c  |POLYGON ((7 1, 7 3, 9 3, 7 1))|
++---+------------------------------+
+```
+
+Write the DataFrame to CSV files:
+
+```python
+df = df.withColumn("geom_wkt", ST_AsText(col("geometry"))).drop("geometry")
+df.repartition(1).write.option("header", True).format("csv").mode("overwrite").save(
+    "/tmp/my_csvs"
+)
+```
+
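+Note that `repartition(1)` is used here to write the DataFrame as a single file. In production, it is usually better to write multiple files in parallel for faster writes. A single file is used here only to keep the example simple.
+
+The CSV file contains: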
+
+```
+id,geom_wkt
+a,"LINESTRING (2 5, 6 1)"
+b,POINT (1 2)
+c,"POLYGON ((7 1, 7 3, 9 3, 7 1))"
+```
+
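+The `geom_wkt` column is stored as plain text, which makes it easy for humans to inspect. It uses the standard WKT format, so any engine that can parse WKT can read this column.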
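+
+Some rows in the file above are quoted and others are not, because WKT for linestrings and polygons contains commas, which are also the CSV delimiter. The sketch below uses only Python's standard `csv` module (not Spark) to show the same minimal-quoting behavior and to confirm that the quoted values are read back as a single field.
+
+```python
+import csv
+import io
+
+rows = [
+    ("a", "LINESTRING (2 5, 6 1)"),
+    ("b", "POINT (1 2)"),
+    ("c", "POLYGON ((7 1, 7 3, 9 3, 7 1))"),
+]
+
+# csv.writer quotes a field only when it contains the delimiter, a quote, or a newline.
+buffer = io.StringIO()
+writer = csv.writer(buffer, lineterminator="\n")
+writer.writerow(["id", "geom_wkt"])
+writer.writerows(rows)
+csv_text = buffer.getvalue()
+print(csv_text)
+
+# Reading back: commas inside quotes stay within the geometry field.
+parsed = list(csv.reader(io.StringIO(csv_text)))
+print(parsed[1])
+```
+
+Only the `POINT` row is written without quotes, because it is the only WKT value without a comma.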
+
+## Read CSV with geometries using Sedona and Spark
+
+Read the CSV files back into a DataFrame:
+
+```python
+df = (
+    sedona.read.option("header", True)
+    .format("CSV")
+    .load("/tmp/my_csvs")
+    .withColumn("geometry", ST_GeomFromText(col("geom_wkt")))
+    .drop("geom_wkt")
+)
+```
+
+The `geom_wkt` column in the file is stored as text, so it must be converted to a geometry column with `ST_GeomFromText` when reading. The DataFrame contains:
+
+```
++---+------------------------------+
+|id |geometry                      |
++---+------------------------------+
+|a  |LINESTRING (2 5, 6 1)         |
+|b  |POINT (1 2)                   |
+|c  |POLYGON ((7 1, 7 3, 9 3, 7 1))|
++---+------------------------------+
+```
+
+Confirm that the schema is correct:
+
+```
+root
+ |-- id: string (nullable = true)
+ |-- geometry: geometry (nullable = true)
+```
+
+## Read and write CSV with Extended Well-Known Text (EWKT)
+
+Let's now see how to write a DataFrame to CSV with EWKT. First, set the SRID on the geometry column:
+
+```python
+df = df.withColumn("geometry", ST_SetSRID(col("geometry"), 4326))
+```
+
+Then write the DataFrame with an EWKT column:
+
+```python
+df = df.withColumn("geom_ewkt", ST_AsEWKT(col("geometry"))).drop("geometry")
+df.repartition(1).write.option("header", True).format("csv").mode("overwrite").save(
+    "/tmp/my_ewkt_csvs"
+)
+```
+
+The CSV file contains:
+
+```
+id,geom_ewkt
+a,"SRID=4326;LINESTRING (2 5, 6 1)"
+b,SRID=4326;POINT (1 2)
+c,"SRID=4326;POLYGON ((7 1, 7 3, 9 3, 7 1))"
+```
+
+Read the CSV with the EWKT column back into a DataFrame:
+
+```python
+df = (
+    sedona.read.option("header", True)
+    .format("csv")
+    .load("/tmp/my_ewkt_csvs")
+    .withColumn("geometry", ST_GeomFromEWKT(col("geom_ewkt")))
+    .drop("geom_ewkt")
+)
+```
+
+The DataFrame contains:
+
+```
++---+------------------------------+
+|id |geometry                      |
++---+------------------------------+
+|a  |LINESTRING (2 5, 6 1)         |
+|b  |POINT (1 2)                   |
+|c  |POLYGON ((7 1, 7 3, 9 3, 7 1))|
++---+------------------------------+
+```
+
+The SRID is not shown when a Sedona DataFrame is printed, but this metadata is kept internally.
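+
+The EWKT values in the file have the form `SRID=<number>;<WKT>`. The sketch below splits such a string in plain Python to show where the SRID lives. It is a simplified illustration that handles only this prefix form, not a full EWKT parser, and `split_ewkt` is a hypothetical helper name.
+
+```python
+def split_ewkt(ewkt):
+    """Return (srid, wkt); srid is None when the string has no 'SRID=' prefix."""
+    if ewkt.upper().startswith("SRID="):
+        prefix, _, wkt = ewkt.partition(";")
+        return int(prefix[len("SRID="):]), wkt
+    return None, ewkt
+
+
+print(split_ewkt("SRID=4326;POINT (1 2)"))
+print(split_ewkt("POINT (1 2)"))
+```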
+
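+## Advantages of CSV for geometric data
+
+Storing geometric data in CSV has the following advantages:
+
+* Most engines support CSV
+* It is human readable
+* CRS information can be stored with the "extended" format (EWKT)
+* The standard has stood the test of time
+
+However, CSV also has many disadvantages.
+
+## Disadvantages of CSV for geometric data
+
+Storing geometric data in CSV files has the following disadvantages:
+
+* CSV is a row-oriented format, so engines cannot read only selected columns. Column pruning, which columnar formats support, is an important performance feature.
+* The row-oriented layout also makes CSV hard to compress efficiently.
+* CSV does not contain a schema, so the engine must either infer the schema or the user must specify it when reading. Schema inference is error-prone, and specifying the schema manually is tedious.
+* CSV does not store row-group metadata, so row groups cannot be skipped.
+* CSV does not store file-level metadata, so entire files cannot be skipped.
+* Even when the SRID metadata is kept, it can only be written on every row of the CSV. Because CSV does not support file-level metadata, this wastes space unnecessarily.
+
+## Conclusion
+
+Spark and Sedona support storing geometric data in CSV, but it is usually slow, so we suggest using it only when necessary.
+
+If you are building a geospatial data lake, GeoParquet is almost always a better choice.
+
+If you are building a geospatial data lakehouse, Iceberg is also a good option.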
diff --git a/docs/tutorial/files/geojson-sedona-spark.zh.md b/docs/tutorial/files/geojson-sedona-spark.zh.md
new file mode 100644
index 00000000000..2894674159e
--- /dev/null
+++ b/docs/tutorial/files/geojson-sedona-spark.zh.md
@@ -0,0 +1,248 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
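+# Working with GeoJSON using Apache Sedona on Spark
+
+This article explains how to read and write single-line and multiline GeoJSON files with Apache Sedona and Spark.
+
+It ends with a summary of the advantages and disadvantages of GeoJSON for spatial analysis.
+
+GeoJSON is based on JSON and supports the following types:
+
+* Point
+* LineString
+* Polygon
+* MultiPoint
+* MultiLineString
+* MultiPolygon
+
+See the [GeoJSON format specification](https://datatracker.ietf.org/doc/html/rfc7946) for more details.
+
+## Read multiline GeoJSON with Sedona and Spark
+
+Read a multiline GeoJSON file as follows: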
+
+```python
+df = (
+    sedona.read.format("geojson")
+    .option("multiLine", "true")
+    .load("data/multiline_geojson.json")
+    .selectExpr("explode(features) as features")
+    .select("features.*")
+    .withColumn("prop0", expr("properties['prop0']"))
+    .drop("properties")
+    .drop("type")
+)
+df.show(truncate=False)
+```
+
+The output is:
+
+```
++---------------------------------------------+------+
+|geometry                                     |prop0 |
++---------------------------------------------+------+
+|POINT (102 0.5)                              |value0|
+|LINESTRING (102 0, 103 1, 104 0, 105 1)      |value1|
+|POLYGON ((100 0, 101 0, 101 1, 100 1, 100 0))|value2|
++---------------------------------------------+------+
+```
+
+This multiline GeoJSON file contains a point, a linestring, and a polygon. The raw file looks like this:
+
+```json
+{ "type": "FeatureCollection",
+    "features": [
+      { "type": "Feature",
+        "geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
+        "properties": {"prop0": "value0"}
+        },
+      { "type": "Feature",
+        "geometry": {
+          "type": "LineString",
+          "coordinates": [
+            [102.0, 0.0], [103.0, 1.0], [104.0, 0.0], [105.0, 1.0]
+            ]
+          },
+        "properties": {
+          "prop0": "value1",
+          "prop1": 0.0
+          }
+        },
+      { "type": "Feature",
+         "geometry": {
+           "type": "Polygon",
+           "coordinates": [
+             [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0],
+               [100.0, 1.0], [100.0, 0.0] ]
+             ]
+         },
+         "properties": {
+           "prop0": "value2",
+           "prop1": {"this": "that"}
+           }
+         }
+       ]
+}
+```
+
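+Note that the overall structure is a `FeatureCollection`, and each feature has a geometry type, geometry coordinates, and properties.
+
+You can also read several multiline GeoJSON files at once. Suppose you have the following directory: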
+
+```
+many_geojsons/
+  file1.json
+  file2.json
+```
+
+Read the files as follows:
+
+```python
+df = (
+    sedona.read.format("geojson").option("multiLine", "true").load("data/many_geojsons")
+)
+```
+
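+You only need to pass the directory that contains the JSON files.
+
+Multiline GeoJSON is friendly for human readers but inefficient for machines. We suggest storing each JSON record on a single line.
+
+## Read single-line GeoJSON with Sedona and Spark
+
+Read a single-line GeoJSON file as follows: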
+
+```python
+df = (
+    sedona.read.format("geojson")
+    .load("data/singleline_geojson.json")
+    .withColumn("prop0", expr("properties['prop0']"))
+    .drop("properties")
+    .drop("type")
+)
+df.show(truncate=False)
+```
+
+The result is:
+
+```
++---------------------------------------------+------+
+|geometry                                     |prop0 |
++---------------------------------------------+------+
+|POINT (102 0.5)                              |value0|
+|LINESTRING (102 0, 103 1, 104 0, 105 1)      |value1|
+|POLYGON ((100 0, 101 0, 101 1, 100 1, 100 0))|value2|
++---------------------------------------------+------+
+```
+
+The data looks like this:
+
+```
+{"type":"Feature","geometry":{"type":"Point","coordinates":[102.0,0.5]},"properties":{"prop0":"value0"}}
+{"type":"Feature","geometry":{"type":"LineString","coordinates":[[102.0,0.0],[103.0,1.0],[104.0,0.0],[105.0,1.0]]},"properties":{"prop0":"value1"}}
+{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100.0,0.0],[101.0,0.0],[101.0,1.0],[100.0,1.0],[100.0,0.0]]]},"properties":{"prop0":"value2"}}
+```
+
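+As you can see, multiline GeoJSON uses one `FeatureCollection`, whereas in single-line GeoJSON every line is an independent `Feature`.
+
+Single-line GeoJSON files are preferable because query engines can split them.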
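+
+Converting from the multiline layout to the single-line layout is mostly a matter of writing each feature on its own line. The sketch below uses only Python's standard `json` module on a small in-memory `FeatureCollection`. The sample data is shortened from the example above, and `to_single_line_geojson` is a hypothetical helper name.
+
+```python
+import json
+
+feature_collection = {
+    "type": "FeatureCollection",
+    "features": [
+        {
+            "type": "Feature",
+            "geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
+            "properties": {"prop0": "value0"},
+        },
+        {
+            "type": "Feature",
+            "geometry": {"type": "LineString", "coordinates": [[102.0, 0.0], [103.0, 1.0]]},
+            "properties": {"prop0": "value1"},
+        },
+    ],
+}
+
+
+def to_single_line_geojson(collection):
+    """Serialize every feature of a FeatureCollection on its own line."""
+    return "\n".join(
+        json.dumps(feature, separators=(",", ":")) for feature in collection["features"]
+    )
+
+
+lines = to_single_line_geojson(feature_collection).splitlines()
+for line in lines:
+    print(line)
+```
+
+Because every line is a complete JSON document, a reader can start at any line boundary, which is what makes the file splittable.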
+
+Let's now see how to write a DataFrame to GeoJSON with Sedona.
+
+## Write GeoJSON with Sedona and Spark
+
+Create a Sedona DataFrame and write it out as GeoJSON:
+
+```python
+df = sedona.createDataFrame([
+    ("a", 'LINESTRING(2.0 5.0,6.0 1.0)'),
+    ("b", 'LINESTRING(7.0 4.0,9.0 2.0)'),
+    ("c", 'LINESTRING(1.0 3.0,3.0 1.0)'),
+], ["id", "geometry"])
+actual = df.withColumn("geometry", ST_GeomFromText(col("geometry")))
+actual.write.format("geojson").mode("overwrite").save("/tmp/a_thing")
+```
+
+The following files are written:
+
+```
+a_thing/
+  _SUCCESS
+  part-00000-856044c5-ae35-4306-bf7a-ae9c3cb25434-c000.json
+  part-00003-856044c5-ae35-4306-bf7a-ae9c3cb25434-c000.json
+  part-00007-856044c5-ae35-4306-bf7a-ae9c3cb25434-c000.json
+  part-00011-856044c5-ae35-4306-bf7a-ae9c3cb25434-c000.json
+```
+
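+Sedona writes multiple GeoJSON files in parallel, which is faster than writing a single file.
+
+Note: the write requires the DataFrame to contain at least one column of geometry type. Sedona decides which column to use as the geometry column with the following rules:
+
+1. If there is a column named "geometry" whose type is geometry, that column is used.
+2. Otherwise, the first geometry column found in the root schema is used.
+
+You can also specify the column manually with the `geometry.column` option: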
+
+```python
+df.write.format("geojson").option("geometry.column", "geometry").save("/tmp/a_thing")
+```
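+
+The column-selection rules, together with the manual override, can be summarized in a few lines of plain Python. This is a sketch of the rules as described on this page, not Sedona's source code. The schema is modeled as a list of hypothetical `(column name, type name)` pairs, and `pick_geometry_column` is a hypothetical helper.
+
+```python
+def pick_geometry_column(schema, override=None):
+    """Choose the geometry column from root-level (name, type_name) pairs."""
+    geometry_columns = [name for name, type_name in schema if type_name == "geometry"]
+    if not geometry_columns:
+        raise ValueError("at least one geometry column is required")
+    if override is not None:
+        # Manual choice, analogous to the geometry.column option.
+        if override not in geometry_columns:
+            raise ValueError(f"{override!r} is not a geometry column")
+        return override
+    if "geometry" in geometry_columns:
+        return "geometry"  # rule 1: a geometry-typed column named "geometry"
+    return geometry_columns[0]  # rule 2: the first geometry column found
+
+
+schema = [("id", "string"), ("boundary", "geometry"), ("geometry", "geometry")]
+print(pick_geometry_column(schema))
+print(pick_geometry_column(schema, override="boundary"))
+```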
+
+Read these GeoJSON files back into a DataFrame:
+
+```python
+df = sedona.read.format("geojson").load("/tmp/a_thing")
+df.show(truncate=False)
+```
+
+```
++---------------------+----------+-------+
+|geometry             |properties|type   |
++---------------------+----------+-------+
+|LINESTRING (1 3, 3 1)|{c}       |Feature|
+|LINESTRING (2 5, 6 1)|{a}       |Feature|
+|LINESTRING (7 4, 9 2)|{b}       |Feature|
++---------------------+----------+-------+
+```
+
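+## Advantages of GeoJSON
+
+The GeoJSON format has the following advantages:
+
+* It is human readable.
+* Multiple files can be written in parallel, which gives parallel processing engines faster I/O.
+* Many engines support GeoJSON / JSON files.
+
+However, GeoJSON also has a number of drawbacks that make it a poor default for storing geospatial data.
+
+## Limitations of GeoJSON
+
+GeoJSON can have the following performance problems in spatial data lake scenarios:
+
+* A GeoJSON object can carry a CRS, but it is optional, so this key information may be lost.
+* It is a row-oriented format and cannot benefit from performance optimizations such as column pruning that columnar formats like GeoParquet offer.
+* It does not store row-group metadata, so row-group filtering (a Parquet performance optimization) is not possible.
+* It has no schema in a file footer, so the schema must be provided manually or inferred.
+* The GeoJSON specification requires a fixed structure, which is rigid for some datasets.
+* It can only be used to build data lakes, not data lakehouses.
+
+## Conclusion
+
+GeoJSON is common in spatial data analysis, so it is convenient that Apache Sedona provides full read and write support.
+
+GeoJSON is widely supported and readable, but it is slower than formats such as GeoParquet. In general, prefer GeoParquet or Iceberg for spatial data analysis to get better performance.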
diff --git a/docs/tutorial/files/geopackage-sedona-spark.zh.md b/docs/tutorial/files/geopackage-sedona-spark.zh.md
new file mode 100644
index 00000000000..fe7cf48775a
--- /dev/null
+++ b/docs/tutorial/files/geopackage-sedona-spark.zh.md
@@ -0,0 +1,198 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
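+# Working with GeoPackage using Apache Sedona on Spark
+
+This article explains how to read GeoPackage files with Apache Sedona and Spark.
+
+You will learn about the advantages and disadvantages of the GeoPackage file format and how to use it in production.
+
+Let's first create a GeoPackage file and then read it.
+
+## Read a GeoPackage file with Sedona and Spark
+
+First, create a GeoPackage file with a few rows of data.
+
+Start by building a GeoPandas DataFrame: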
+
+```python
+import geopandas as gpd
+from shapely.geometry import Point, Polygon
+
+point1 = Point(0, 0)
+point2 = Point(1, 1)
+polygon1 = Polygon([(5, 5), (6, 6), (7, 5), (6, 4)])
+
+data = {
+    "name": ["Point A", "Point B", "Polygon A"],
+    "value": [10, 20, 30],
+    "geometry": [point1, point2, polygon1],
+}
+gdf = gpd.GeoDataFrame(data, geometry="geometry")
+```
+
+Write the GeoPandas DataFrame to a GeoPackage file:
+
+```python
+gdf.to_file("/tmp/my_file.gpkg", layer="my_layer", driver="GPKG")
+```
+
+The driver is set to `GPKG` in the code, so GeoPandas writes the file in the GeoPackage format.
+
+You can think of a layer as a table name.
+
+Next, read the GeoPackage file with Apache Sedona and Spark:
+
+```python
+df = (
+    sedona.read.format("geopackage")
+    .option("tableName", "my_layer")
+    .load("/tmp/my_file.gpkg")
+)
+df.show()
+```
+
+The DataFrame contains:
+
+```
++---+--------------------+---------+-----+
+|fid|                geom|     name|value|
++---+--------------------+---------+-----+
+|  1|         POINT (0 0)|  Point A|   10|
+|  2|         POINT (1 1)|  Point B|   20|
+|  3|POLYGON ((5 5, 6 ...|Polygon A|   30|
++---+--------------------+---------+-----+
+```
+
+几何列可以包含点、多边形等多种几何对象。
+
+也可以查看 GeoPackage 文件的元数据：
+
+```python
+df = (
+    sedona.read.format("geopackage")
+    .option("showMetadata", "true")
+    .load("/tmp/my_file.gpkg")
+)
+df.show()
+```
+
+输出如下：
+
+```
++----------+---------+----------+-----------+--------------------+-----+-----+-----+-----+------+
+|table_name|data_type|identifier|description|         last_change|min_x|min_y|max_x|max_y|srs_id|
++----------+---------+----------+-----------+--------------------+-----+-----+-----+-----+------+
+|  my_layer| features|  my_layer|           |2025-02-25 06:28:...|  0.0|  0.0|  7.0|  6.0| 99999|
++----------+---------+----------+-----------+--------------------+-----+-----+-----+-----+------+
+```
+
+## 使用 Sedona 与 Spark 读取多个 GeoPackage 文件
+
+Sedona 也支持读取多个 GeoPackage 文件。假设有以下文件结构：
+
+```
+/tmp/gpkgs/
+  my_file1.gpkg
+  my_file2.gpkg
+```
+
+可以这样读取所有文件：
+
+```python
+df = sedona.read.format("geopackage").option("tableName", "my_layer").load("/tmp/gpkgs")
+df.show()
+```
+
+结果如下：
+
+```
++---+--------------------+---------+-----+
+|fid|                geom|     name|value|
++---+--------------------+---------+-----+
+|  1|         POINT (5 5)|  Point C|   30|
+|  2|POLYGON ((5 5, 6 ...|Polygon A|   40|
+|  1|         POINT (0 0)|  Point A|   10|
+|  2|         POINT (1 1)|  Point B|   20|
++---+--------------------+---------+-----+
+```
+
+只需指定包含 GeoPackage 文件的目录，Sedona 即可将它们全部加载到一个 DataFrame 中。
+
+由于 Sedona 可以并行读取与处理这些文件，因此非常适合分析大量 GeoPackage 文件。
+
+## 加载 GeoPackage 中的栅格数据
+
+也可以从 GeoPackage 中的栅格表加载数据。代码如下：
+
+```python
+df = (
+    sedona.read.format("geopackage")
+    .option("tableName", "raster_table")
+    .load("/path/to/geopackage")
+)
+df.show()
+```
+
+DataFrame 内容如下：
+
+```
++---+----------+-----------+--------+--------------------+
+| id|zoom_level|tile_column|tile_row|           tile_data|
++---+----------+-----------+--------+--------------------+
+|  1|        11|        428|     778|GridCoverage2D["c...|
+|  2|        11|        429|     778|GridCoverage2D["c...|
+|  3|        11|        428|     779|GridCoverage2D["c...|
+|  4|        11|        429|     779|GridCoverage2D["c...|
+|  5|        11|        427|     777|GridCoverage2D["c...|
++---+----------+-----------+--------+--------------------+
+```
+
+已知限制（v1.7.0）：
+
+* 不支持 webp 栅格
+* 不支持 ewkb 几何
+* 不支持基于几何包络的过滤
+
+以上限制都将在后续版本中陆续解决，敬请关注！
+
+## GeoPackage 文件格式的优势
+
+GeoPackage 格式有许多优点：
+
+* 因为是开放格式，任何引擎都可以支持。
+* 与许多其他格式不同，它是可变的。
+* 不像某些格式，它会保存 CRS 信息。
+* 既可以存储矢量数据，也可以存储栅格数据。
+* GeoPandas、Sedona、SQLite 等众多引擎都能读取。
+
+但 GeoPackage 也存在不少不足。
+
+## GeoPackage 的劣势
+
+GeoPackage 文件格式有以下劣势：
+
+* 行式存储，无法享受列式格式的列裁剪优势。
+* 不支持多引擎并发事务。
+* 虽然支持 SQLite 事务，但跨引擎可靠事务很难实现。
+* 并非所有引擎都完整支持。
+
+## 结论
+
+如果您本身在使用 SQLite，GeoPackage 是非常稳健的格式选择。
+
+Sedona 能读取由 SQLite 分析生成的 GeoPackage 文件这一点非常有价值——可以并行读取这些文件，并对海量数据进行分析；同时也可以在集群上运行 Sedona。
+
+如果您还没有使用过 GeoPackage，那么使用 GeoParquet、Iceberg 这类格式通常会是更好的选择。
diff --git a/docs/tutorial/files/geotiffmetadata-sedona-spark.zh.md b/docs/tutorial/files/geotiffmetadata-sedona-spark.zh.md
new file mode 100644
index 00000000000..2d8603fb6e2
--- /dev/null
+++ b/docs/tutorial/files/geotiffmetadata-sedona-spark.zh.md
@@ -0,0 +1,186 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# GeoTiffMetadata - GeoTIFF 文件元数据
+
+`GeoTiffMetadata` 是 Spark 数据源，用于读取 GeoTIFF 文件的元数据，而不解码像素数据，行为类似于 [gdalinfo](https://gdal.org/en/stable/programs/gdalinfo.html)。它会为每个文件返回一行，包含尺寸、坐标系、波段信息、瓦片化、概览（overview）以及压缩等元数据。
+
+适用场景：
+
+* 对大批量栅格文件进行编目与盘点
+* 通过检查瓦片化与概览状态识别 Cloud Optimized GeoTIFF（COG）
+* 在加载完整栅格数据前先检查文件属性
+* 基于栅格文件集合构建空间索引
+
+![Schema 概览](../../image/geotiff_metadata/schema_overview.svg "geotiff.metadata 输出 schema")
+
+## 检测 COG
+
+Cloud Optimized GeoTIFF（COG）是带有内部瓦片与概览结构、面向云端访问优化过的 GeoTIFF 文件。`geotiff.metadata` 数据源会直接报告这些属性：
+
+![COG 结构](../../image/geotiff_metadata/cog_structure.svg "COG 属性如何映射到 geotiff.metadata 字段")
+
+```python
+df = sedona.read.format("geotiff.metadata").load("/path/to/rasters/")
+cogs = df.filter("isTiled AND size(overviews) > 0")
+cogs.select("path", "compression", "overviews").show(truncate=False)
+```
+
+## 读取 GeoTIFF 元数据
+
+=== "Scala"
+
+    ```scala
+    val df = sedona.read.format("geotiff.metadata").load("/path/to/rasters/")
+    df.show()
+    ```
+
+=== "Java"
+
+    ```java
+    Dataset<Row> df = sedona.read().format("geotiff.metadata").load("/path/to/rasters/");
+    df.show();
+    ```
+
+=== "Python"
+
+    ```python
+    df = sedona.read.format("geotiff.metadata").load("/path/to/rasters/")
+    df.show()
+    ```
+
+也支持 glob 通配符：
+
+```python
+df = sedona.read.format("geotiff.metadata").load("/path/to/rasters/*.tif")
+```
+
+或加载单个文件：
+
+```python
+df = sedona.read.format("geotiff.metadata").load("/path/to/image.tiff")
+```
+
+## 输出 schema
+
+每行代表一个 GeoTIFF 文件，包含以下列：
+
+| 列 | 类型 | 说明 |
+|--------|------|-------------|
+| `path` | String | 文件路径 |
+| `driver` | String | 格式驱动（`"GTiff"`） |
+| `fileSize` | Long | 文件大小（字节） |
+| `width` | Int | 图像宽度（像素） |
+| `height` | Int | 图像高度（像素） |
+| `numBands` | Int | 波段数 |
+| `srid` | Int | EPSG 编号（未知时为 0） |
+| `crs` | String | 以 WKT 表示的坐标参考系 |
+| `geoTransform` | Struct | 仿射变换参数 |
+| `cornerCoordinates` | Struct | 边界框 |
+| `bands` | Array[Struct] | 每个波段的元数据 |
+| `overviews` | Array[Struct] | 概览（金字塔）层级 |
+| `metadata` | Map[String, String] | 文件级 TIFF 元数据标签 |
+| `isTiled` | Boolean | 是否使用了内部瓦片化 |
+| `compression` | String | 压缩类型（如 `"LZW"`、`"Deflate"`） |
+
+### geoTransform 结构体
+
+| 字段 | 类型 | 说明 |
+|-------|------|-------------|
+| `upperLeftX` | Double | 世界坐标系下的原点 X |
+| `upperLeftY` | Double | 世界坐标系下的原点 Y |
+| `scaleX` | Double | X 方向的像素大小 |
+| `scaleY` | Double | Y 方向的像素大小 |
+| `skewX` | Double | X 方向的旋转/剪切 |
+| `skewY` | Double | Y 方向的旋转/剪切 |
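+
+这六个参数共同定义了从像素坐标到世界坐标的仿射变换。下面的示意假设这些字段遵循 GDAL / world file 的常见约定（`skewX` 与行号相乘、`skewY` 与列号相乘）；`pixel_to_world` 是示例函数名，并非 Sedona 提供的 API，实际使用前请对照数据核实：
+
+```python
+def pixel_to_world(gt, col, row):
+    """把像素 (col, row) 的左上角换算为世界坐标 (x, y)。
+
+    gt 是包含 geoTransform 六个字段的字典。
+    """
+    x = gt["upperLeftX"] + col * gt["scaleX"] + row * gt["skewX"]
+    y = gt["upperLeftY"] + col * gt["skewY"] + row * gt["scaleY"]
+    return x, y
+
+
+# 示例：北朝上的栅格，scaleY 通常为负值
+gt = {
+    "upperLeftX": 100.0,
+    "upperLeftY": 200.0,
+    "scaleX": 2.0,
+    "scaleY": -2.0,
+    "skewX": 0.0,
+    "skewY": 0.0,
+}
+print(pixel_to_world(gt, 3, 4))  # (106.0, 192.0)
+```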
+
+### cornerCoordinates 结构体
+
+| 字段 | 类型 | 说明 |
+|-------|------|-------------|
+| `minX` | Double | X 最小值（西） |
+| `minY` | Double | Y 最小值（南） |
+| `maxX` | Double | X 最大值（东） |
+| `maxY` | Double | Y 最大值（北） |
+
+### bands 数组元素
+
+| 字段 | 类型 | 说明 |
+|-------|------|-------------|
+| `band` | Int | 波段编号（从 1 开始） |
+| `dataType` | String | 数据类型（如 `"REAL_32BITS"`） |
+| `colorInterpretation` | String | 颜色解释（如 `"Gray"`、`"Red"`） |
+| `noDataValue` | Double | NoData 值（未设置时为 null） |
+| `blockWidth` | Int | 内部 tile/block 宽度 |
+| `blockHeight` | Int | 内部 tile/block 高度 |
+| `description` | String | 波段描述 |
+| `unit` | String | 单位（如 `"meters"`） |
+
+### overviews 数组元素
+
+| 字段 | 类型 | 说明 |
+|-------|------|-------------|
+| `level` | Int | 概览层级（1, 2, 3, ...） |
+| `width` | Int | 概览宽度（像素） |
+| `height` | Int | 概览高度（像素） |
+
+## 示例
+
+### 查看波段信息
+
+```python
+df = sedona.read.format("geotiff.metadata").load("/path/to/image.tif")
+df.selectExpr("path", "explode(bands) as band").selectExpr(
+    "path",
+    "band.band",
+    "band.dataType",
+    "band.noDataValue",
+    "band.blockWidth",
+    "band.blockHeight",
+).show()
+```
+
+### 按空间范围过滤
+
+```python
+df = sedona.read.format("geotiff.metadata").load("/path/to/rasters/")
+df.filter("cornerCoordinates.minX > -120 AND cornerCoordinates.maxX < -100").select(
+    "path", "width", "height", "srid"
+).show()
+```
+
+### 获取概览详情
+
+```python
+df = sedona.read.format("geotiff.metadata").load("/path/to/image.tif")
+df.selectExpr("path", "explode(overviews) as ovr").selectExpr(
+    "path", "ovr.level", "ovr.width", "ovr.height"
+).show()
+```
+
+### 仅选取需要的列
+
+```python
+df = (
+    sedona.read.format("geotiff.metadata")
+    .load("/path/to/rasters/")
+    .select("path", "width", "height", "numBands")
+)
+df.show()
+```
diff --git a/docs/tutorial/files/shapefiles-sedona-spark.zh.md b/docs/tutorial/files/shapefiles-sedona-spark.zh.md
new file mode 100644
index 00000000000..9ff7386c05a
--- /dev/null
+++ b/docs/tutorial/files/shapefiles-sedona-spark.zh.md
@@ -0,0 +1,219 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# 在 Spark 上使用 Apache Sedona 处理 Shapefile
+
+本文介绍如何使用 Apache Sedona 与 Spark 读取 Shapefile。
+
+Shapefile 是 “Esri 矢量数据存储格式，用于存储地理要素的位置、形状与属性”。Shapefile 格式由 Esri 私有，但 [规范是公开的](https://www.esri.com/content/dam/esrisites/sitecore-archive/Files/Pdfs/library/whitepapers/pdfs/shapefile.pdf)。
+
+Shapefile 有不少限制，但仍被广泛使用，因此 Sedona 能读取它非常有价值。
+
+下面看如何用 Sedona 与 Spark 读取 Shapefile。
+
+## 使用 Sedona 与 Spark 读取 Shapefile
+
+先用 GeoPandas 与 Shapely 创建一个 Shapefile：
+
+```python
+import geopandas as gpd
+from shapely.geometry import Point
+
+point1 = Point(0, 0)
+point2 = Point(1, 1)
+
+data = {"name": ["Point A", "Point B"], "value": [10, 20], "geometry": [point1, point2]}
+
+gdf = gpd.GeoDataFrame(data, geometry="geometry")
+gdf.to_file("/tmp/my_geodata.shp")
+```
+
+输出的文件如下：
+
+```
+/tmp/
+  my_geodata.cpg
+  my_geodata.dbf
+  my_geodata.shp
+  my_geodata.shx
+```
+
+Shapefile 并不是单一文件，其数据分散在多个文件中。
+
+将 Shapefile 读入由 Spark 驱动的 Sedona DataFrame：
+
+```python
+df = sedona.read.format("shapefile").load("/tmp/my_geodata.shp")
+df.show()
+```
+
+```
++-----------+-------+-----+
+|   geometry|   name|value|
++-----------+-------+-----+
+|POINT (0 0)|Point A|   10|
+|POINT (1 1)|Point B|   20|
++-----------+-------+-----+
+```
+
+也可以一并读取每行的唯一 record number：
+
+```python
+df = (
+    sedona.read.format("shapefile")
+    .option("key.name", "FID")
+    .load("/tmp/my_geodata.shp")
+)
+df.show()
+```
+
+```
++-----------+---+-------+-----+
+|   geometry|FID|   name|value|
++-----------+---+-------+-----+
+|POINT (0 0)|  1|Point A|   10|
+|POINT (1 1)|  2|Point B|   20|
++-----------+---+-------+-----+
+```
+
+几何列默认名为 `geometry`。可通过 `geometry.name` 选项修改。如果某个非空间属性恰好叫 `geometry`，则必须配置 `geometry.name` 以避免冲突：
+
+```python
+df = (
+    sedona.read.format("shapefile")
+    .option("geometry.name", "geom")
+    .load("/path/to/shapefile")
+)
+```
+
+字符串属性的字符编码会从 `.cpg` 文件推断。如果字符串字段出现乱码，可以通过 `charset` 选项手动指定正确的字符集，例如：
+
+=== "Scala"
+
+    ```scala
+    val df = sedona.read.format("shapefile").option("charset", "UTF-8").load("/path/to/shapefile")
+    ```
+
+=== "Java"
+
+    ```java
+    Dataset<Row> df = sedona.read().format("shapefile").option("charset", "UTF-8").load("/path/to/shapefile");
+    ```
+
+=== "Python"
+
+    ```python
+    df = (
+        sedona.read.format("shapefile")
+        .option("charset", "UTF-8")
+        .load("/path/to/shapefile")
+    )
+    ```
+
+下面看如何在 Sedona DataFrame 中加载多个 Shapefile。
+
+## 使用 Sedona 加载多个 Shapefile
+
+假设有如下目录结构：
+
+```
+/tmp/shapefiles/
+  file1.cpg
+  file1.dbf
+  file1.shp
+  file1.shx
+  file2.cpg
+  file2.dbf
+  file2.shp
+  file2.shx
+```
+
+目录中包含 2 个 `.shp` 以及对应的辅助文件。
+
+可以这样把多个 Shapefile 加载到 Sedona DataFrame 中：
+
+```python
+df = sedona.read.format("shapefile").load("/tmp/shapefiles")
+df.show()
+```
+
+```
++-----------+-------+-----+
+|   geometry|   name|value|
++-----------+-------+-----+
+|POINT (0 0)|Point A|   10|
+|POINT (1 1)|Point B|   20|
+|POINT (2 2)|Point C|   10|
+|POINT (3 3)|Point D|   20|
++-----------+-------+-----+
+```
+
+只需将 Shapefile 所在目录传入，Sedona 即可识别加载。
+
+输入路径既可以是包含一个或多个 Shapefile 的目录，也可以是 `.shp` 文件本身：
+
+* 输入是目录时，会加载该目录下直接存在的所有 Shapefile。如果还需要加载子目录中的 Shapefile，请加上 `.option("recursiveFileLookup", "true")`。
+* 输入是 `.shp` 文件时，Sedona 会自动查找同名的 `.dbf`、`.shx` 等同伴文件并一并加载。
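+
+“同名、不同扩展名”的查找约定可以用下面的草图来理解。这只是一个示意：`companion_files` 为示例函数，扩展名列表也是假设的，并不代表 Sedona 的实际实现：
+
+```python
+from pathlib import Path
+
+# 常见的 Shapefile 同伴文件扩展名（假设的列表，并不完整）
+COMPANION_SUFFIXES = (".dbf", ".shx", ".cpg", ".prj")
+
+
+def companion_files(shp_path):
+    """返回与 .shp 同名且实际存在的同伴文件路径。"""
+    shp = Path(shp_path)
+    candidates = (shp.with_suffix(suffix) for suffix in COMPANION_SUFFIXES)
+    return [p for p in candidates if p.exists()]
+```
+
+在把 Shapefile 拷贝到 HDFS 或对象存储之前，可以用类似的检查确认同伴文件没有遗漏。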
+
+## Shapefile 的优势
+
+Shapefile 与 Esri 生态深度集成，并被大量服务广泛使用。
+
+可以从 Esri 输出 Shapefile，再用 Sedona 等其他引擎读取。
+
+不过 Esri 在上世纪 90 年代初创建了 Shapefile 格式，因此存在不少局限。
+
+## Shapefile 的限制
+
+Shapefile 的部分缺点包括：
+
+* 不支持复杂几何类型
+* 不支持 NULL 值
+* 会对数字进行四舍五入
+* 对 Unicode 支持较差
+* 字段名不能太长
+* 单文件大小限制为 2GB
+* 空间索引相比其他方案较慢
+* 无法存储日期时间
+
+更多信息请参阅 [Shapefile 局限性的相关资料](http://switchfromshapefile.org/)。
+
+由于这些限制，建议考虑其他更现代的格式。
+
+## Shapefile 的替代方案
+
+适合存储几何数据的格式有很多：
+
+* Iceberg
+* [GeoParquet](geoparquet-sedona-spark.md)
+* FlatGeoBuf
+* [GeoPackage](geopackage-sedona-spark.md)
+* [GeoJSON](geojson-sedona-spark.md)
+* [CSV](csv-geometry-sedona-spark.md)
+* GeoTIFF
+
+## 为什么 Sedona 不支持写出 Shapefile
+
+Sedona 不写出 Shapefile 主要有两个原因：
+
+1. 每个 Shapefile 由多个文件组成，对分布式系统来说写出比较困难。
+2. Shapefile 单文件 2GB 的硬限制对一些空间数据来说不够用。
+
+## 结论
+
+Shapefile 是仍在许多生产应用中使用的遗留格式。但其限制颇多，除非需要兼容旧系统，否则在现代数据管道中并不是最佳选择。
diff --git a/docs/tutorial/flink/pyflink-sql.zh.md b/docs/tutorial/flink/pyflink-sql.zh.md
new file mode 100644
index 00000000000..e06d476feb4
--- /dev/null
+++ b/docs/tutorial/flink/pyflink-sql.zh.md
@@ -0,0 +1,72 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+要在 Apache Sedona 中配置 PyFlink，请先按 [PyFlink](../../setup/flink/install-python.md) 指南完成安装。
+完成后，可以运行下面的代码以验证环境是否正常工作。
+
+```python
+from sedona.flink import SedonaContext
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.table import EnvironmentSettings, StreamTableEnvironment
+
+stream_env = StreamExecutionEnvironment.get_execution_environment()
+flink_settings = EnvironmentSettings.in_streaming_mode()
+table_env = SedonaContext.create(stream_env, flink_settings)
+
+table_env.sql_query("SELECT ST_Point(1.0, 2.0)").execute()
+```
+
+PyFlink 不支持把 Scala 自定义类型（UDT）转换为 Python UDT。
+因此，如果想在 Python 中收集结果，需要使用 `ST_AsText` 或 `ST_ASBinary` 等函数把结果转换为字符串或二进制。
+
+```python
+from shapely.wkb import loads
+
+result = table_env.sql_query("SELECT ST_ASBinary(ST_Point(1.0, 2.0))").execute().collect()
+
+[loads(bytes(el[0])) for el in result]
+```
+
+```
+[<POINT (1 2)>]
+```
+
+用户自定义标量函数（UDF）也是类似的处理方式：
+
+```python
+from pyflink.table.udf import ScalarFunction, udf
+from shapely.wkb import loads
+
+
+class Buffer(ScalarFunction):
+    def eval(self, s):
+        geom = loads(s)
+        return geom.buffer(1).wkb
+
+
+table_env.create_temporary_function(
+    "ST_BufferPython", udf(Buffer(), result_type="Binary")
+)
+
+buffer_table = table_env.sql_query(
+    "SELECT ST_BufferPython(ST_ASBinary(ST_Point(1.0, 2.0))) AS buffer"
+)
+```
+
+更多 SQL 示例请参阅 FlinkSQL 章节：[FlinkSQL](sql.md)。
diff --git a/docs/tutorial/jupyter-notebook.zh.md b/docs/tutorial/jupyter-notebook.zh.md
new file mode 100644
index 00000000000..d727045da56
--- /dev/null
+++ b/docs/tutorial/jupyter-notebook.zh.md
@@ -0,0 +1,58 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# Python Jupyter Notebook 示例
+
+Sedona Python 提供了一系列 [Jupyter Notebook 示例](https://github.com/apache/sedona/blob/master/docs/usecases/)。
+
+请按以下步骤使用 Pipenv 在本机运行 Jupyter Notebook：
+
+1. 克隆 Sedona GitHub 仓库或下载源代码
+2. 从 PyPI 或 GitHub 源码安装 Sedona Python：参考 [安装 Sedona Python](../setup/install-python.md#install-sedona)。
+3. 准备 spark-shaded jar：参考 [安装 Sedona Python](../setup/install-python.md#prepare-sedona-spark-jar)。
+4. 设置 Pipenv 的 Python 版本（请使用您所需的 Python 版本）：
+
+```bash
+cd docs/usecases
+pipenv --python 3.8
+```
+
+5. 安装依赖：
+
+```bash
+cd docs/usecases
+pipenv install
+```
+
+6. 在 Pipenv 中安装 Jupyter Notebook 内核：
+
+```bash
+pipenv install ipykernel
+pipenv shell
+```
+
+7. 在 Pipenv shell 中执行：
+
+```bash
+python -m ipykernel install --user --name=apache-sedona
+```
+
+8. 如果之前未配置过，请设置环境变量 `SPARK_HOME` 与 `PYTHONPATH`，参考 [安装 Sedona Python](../setup/install-python.md#setup-environment-variables)。
+9. 启动 Jupyter Notebook：`jupyter notebook`
+10. 选择 Sedona notebook，在 notebook 中依次选择 Kernel -> Change Kernel，选择刚才注册的内核。
diff --git a/docs/tutorial/sql-pure-sql.zh.md b/docs/tutorial/sql-pure-sql.zh.md
new file mode 100644
index 00000000000..de4b9b6dc3d
--- /dev/null
+++ b/docs/tutorial/sql-pure-sql.zh.md
@@ -0,0 +1,101 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+从 ==Sedona v1.0.1== 开始，您可以在纯 Spark SQL 环境中使用 Sedona，示例代码均以 SQL 编写。
+
+SedonaSQL 支持 SQL/MM Part3 空间 SQL 标准。SedonaSQL 详细的 API 说明请参阅 [SedonaSQL API](../api/sql/Overview.md)。
+
+## 启动会话
+
+按以下方式启动 `spark-sql`（请将 `<VERSION>` 替换为实际版本，如 `{{ sedona.current_version }}`）：
+
+!!! abstract "使用 Apache Sedona 启动 spark-sql"
+
+	=== "Spark 3.3+ 与 Scala 2.12"
+
+        ```sh
+        spark-sql --packages org.apache.sedona:sedona-spark-shaded-3.3_2.12:<VERSION>,org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }} \
+          --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
+          --conf spark.kryo.registrator=org.apache.sedona.viz.core.Serde.SedonaVizKryoRegistrator \
+          --conf spark.sql.extensions=org.apache.sedona.viz.sql.SedonaVizExtensions,org.apache.sedona.sql.SedonaSqlExtensions
+        ```
+
+        请将 artifact 名称中的 `3.3` 替换为对应的 Spark major.minor 版本。
+
+这会注册 SedonaSQL 与 SedonaViz 的全部类型、函数与优化规则。
+
+## 加载数据
+
+下面使用 `examples/sql` 目录中的数据。执行以下两条命令，从 CSV 文件加载数据并创建原始表：
+
+```sql
+CREATE TABLE IF NOT EXISTS pointraw (_c0 string, _c1 string)
+USING csv
+OPTIONS(header='false')
+LOCATION '<some path>/sedona/examples/sql/src/test/resources/testpoint.csv';
+
+CREATE TABLE IF NOT EXISTS polygonraw (_c0 string, _c1 string, _c2 string, _c3 string)
+USING csv
+OPTIONS(header='false')
+LOCATION '<some path>/sedona/examples/sql/src/test/resources/testenvelope.csv';
+
+```
+
+## 转换数据
+
+需要把点和多边形数据转换为对应的几何类型：
+
+```sql
+CREATE OR REPLACE TEMP VIEW pointdata AS
+  SELECT ST_Point(cast(pointraw._c0 as Decimal(24,20)), cast(pointraw._c1 as Decimal(24,20))) AS pointshape
+  FROM pointraw;
+
+CREATE OR REPLACE TEMP VIEW polygondata AS
+  select ST_PolygonFromEnvelope(cast(polygonraw._c0 as Decimal(24,20)),
+        cast(polygonraw._c1 as Decimal(24,20)), cast(polygonraw._c2 as Decimal(24,20)),
+        cast(polygonraw._c3 as Decimal(24,20))) AS polygonshape
+  FROM polygonraw;
+```
+
+## 处理数据
+
+例如，对多边形和点数据做一次连接：
+
+```sql
+SELECT * from polygondata, pointdata
+WHERE ST_Contains(polygondata.polygonshape, pointdata.pointshape)
+      AND ST_Contains(ST_PolygonFromEnvelope(1.0,101.0,501.0,601.0), polygondata.polygonshape)
+LIMIT 5;
+```
+
+## `GEOMETRY` 数据类型支持
+
+Sedona 提供了一个 Spark SQL 解析器扩展，使 DDL 语句中可以直接使用 `GEOMETRY` 数据类型。例如，可以在创建表时声明带几何列的 schema：
+
+```sql
+CREATE TABLE geom_table (id STRING, version INT, geometry GEOMETRY)
+USING geoparquet
+LOCATION '/path/to/geoparquet_geom_table';
+
+SELECT * FROM geom_table LIMIT 10;
+```
+
+该 SQL 解析器扩展默认启用。如果它与其他扩展存在冲突需要禁用，请在启动 `spark-sql` 时通过 `--conf spark.sedona.enableParserExtension=false` 关闭。
diff --git a/docs/tutorial/storing-blobs-in-parquet.zh.md b/docs/tutorial/storing-blobs-in-parquet.zh.md
new file mode 100644
index 00000000000..50a01b6564e
--- /dev/null
+++ b/docs/tutorial/storing-blobs-in-parquet.zh.md
@@ -0,0 +1,67 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# 在 Parquet 文件中存储大型栅格几何对象
+
+!!!warning
+    在保存栅格几何对象之前，务必先用 `RS_AsXXX` 函数将其转换为公认的标准格式。
+    虽然可以保存栅格几何对象的原始字节，但这是 Sedona 的内部格式，不保证跨版本稳定。
+
+Spark 的默认设置并不适合存储栅格几何对象这类大型二进制数据。
+为此投入时间调优和基准测试是非常值得的。
+使用默认设置写入大型二进制数据会得到结构很差、读取代价非常高的 Parquet 文件。
+做一些基本调优可以让读取性能提升数个数量级。
+
+## 背景
+
+Parquet 文件被划分为一个或多个 row group。
+每个 row group 中的每一列存储为一个 column chunk，
+每个 column chunk 又进一步划分为 page。
+page 在压缩与编码层面是不可分割的最小单位，默认大小为 1 MB。
+数据先在内存中缓冲，写满 page 后再落盘写入。
+对 page 大小的检查频率介于 `parquet.page.size.row.check.min` 与 `parquet.page.size.row.check.max` 之间（默认在 100 到 10000 行之间）。
+
+如果您按默认设置将 5 MB 的图像文件写入 Parquet，第一次 page 大小检查会在 100 行后才发生。
+这样得到的 page 会是 500 MB 而非 1 MB。
+读取这种文件会消耗大量内存，速度也会很慢。
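+
+上面的数字可以用一个简单的估算来验证。以下草图只做算术演示，`first_page_size_mb` 是示例函数；真实的 page 大小还会受编码与压缩影响：
+
+```python
+def first_page_size_mb(row_size_mb, first_check_rows):
+    """估算第一次 page 大小检查发生时已缓冲的数据量（MB）。"""
+    # 在第一次检查之前，所有行都会被缓冲到同一个 page 中
+    return row_size_mb * first_check_rows
+
+
+print(first_page_size_mb(5, 100))  # 第一次检查在 100 行后：500
+print(first_page_size_mb(5, 2))  # 第一次检查在 2 行后：10
+```
+
+可以看到，第一次检查前缓冲的行数越多，page 就越大，因此对大型二进制列应尽早进行 page 大小检查。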
+
+## 读取结构不佳的 Parquet 文件
+
+snappy 压缩对超大 page 尤为敏感。
+更合适的选择是不压缩或使用 zstd 压缩。
+您可以将 `spark.buffer.size` 设置为大于默认 64k 的值以提升读取性能。
+不过调大 `spark.buffer.size` 可能会让 Parquet 文件中的其他列承担额外的 I/O 开销。
+
+## 为大块二进制数据写出结构更佳的 Parquet 文件
+
+理想情况下，您希望以合理的 page 大小写出 Parquet 文件，从而在不同客户端读取时都能获得更好且更一致的性能。
+自 parquet-hadoop 1.12.0 起（Spark 3.2 内置该版本），可以通过 Hadoop 属性来控制 page 大小检查。
+更适合写大块数据的设置如下：
+
+```
+spark.sql.parquet.compression.codec=zstd
+spark.hadoop.parquet.page.size.row.check.min=2
+spark.hadoop.parquet.page.size.row.check.max=10
+```
+
+整体上 zstd 比 snappy 表现更好，对于大型 page 更是如此。
+第一次 page 大小检查会在 2 行后进行；如果 2 行后 page 仍未写满，下一次检查会在再写 2 到 10 行后发生（具体取决于已写入两行的字节数）。
+
+Spark 会把以 “spark.hadoop.” 为前缀的 Spark 属性映射成 Hadoop 属性。
+完整的 Parquet Hadoop 属性列表请参考：https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/README.md
diff --git a/docs/tutorial/viz-gallery.zh.md b/docs/tutorial/viz-gallery.zh.md
new file mode 100644
index 00000000000..7c027dfffe3
--- /dev/null
+++ b/docs/tutorial/viz-gallery.zh.md
@@ -0,0 +1,24 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+![美国铁路](../image/usrail.png)
+
+![美国推文](../image/ustweet.png)
+
+![纽约市热力图](../image/heatmapnycsmall.png)
diff --git a/docs/tutorial/viz.zh.md b/docs/tutorial/viz.zh.md
new file mode 100644
index 00000000000..e985f1db020
--- /dev/null
+++ b/docs/tutorial/viz.zh.md
@@ -0,0 +1,247 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+本页介绍如何使用 SedonaViz 可视化空间数据。==示例代码使用 Scala 编写，但同样适用于 Java==。
+
+SedonaViz 通过对 Sedona 处理大规模空间数据的能力进行扩展，原生支持通用的地图制图设计。它可以可视化空间 RDD 与空间查询，并以并行方式渲染超高分辨率图像。
+
+SedonaViz 提供了 Map Visualization SQL，使用户能更灵活地设计美观的地图可视化效果，包括散点图与热力图。同时也提供 SedonaViz RDD API。
+
+!!!note
+	SedonaViz 的 SQL/DataFrame API 全部说明请参阅 [SedonaViz API](../api/viz/sql.md)。
+
+## 为什么需要可扩展的地图可视化
+
+数据可视化让用户能够对数据进行总结、分析与推理。要在多个缩放级别上保证细致而准确的地理空间地图可视化，需要极高分辨率的地图。Google Maps、MapBox、ArcGIS 等传统方案受计算资源限制，对大规模地理空间数据生成地图需要花费大量时间。在大空间数据场景下，这类工具往往直接崩溃或长时间无法完成。
+
+SedonaViz 把地图可视化流程的主要步骤（如像素化、聚合、渲染）封装为一组可大规模并行的 GeoViz 算子，用户可以自由组合任意自定义样式。
+
+## 可视化 SpatialRDD
+
+本教程主要介绍 SQL/DataFrame API。
+
+## 配置依赖
+
+1. 阅读 [Sedona Maven Central 坐标](../setup/maven-coordinates.md)
+2. 添加 [Apache Spark core](https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11)、[Apache SparkSQL](https://mvnrepository.com/artifact/org.apache.spark/spark-sql)、Sedona-core、Sedona-SQL、Sedona-Viz 依赖
+
+## 创建 Sedona 配置
+
+在程序开头使用以下代码创建 Sedona 配置。如果您已经有了由 Wherobots / AWS EMR / Databricks 创建的 SparkSession（通常名为 `spark`），可跳过此步骤直接使用 `spark`。
+
+==Sedona >= 1.4.1==
+
+```scala
+val config = SedonaContext.builder()
+		.config("spark.kryo.registrator", classOf[SedonaVizKryoRegistrator].getName) // org.apache.sedona.viz.core.Serde.SedonaVizKryoRegistrator
+		.master("local[*]") // 集群模式下请删除此行
+		.appName("Sedona Viz") // 改成合适的名字
+		.getOrCreate()
+```
+
+==Sedona <1.4.1==
+
+下面这种方式自 Sedona 1.4.1 起已弃用，请改用上面的方式。
+
+```scala
+var sparkSession = SparkSession.builder()
+.master("local[*]") // 集群模式下请删除此行
+.appName("Sedona Viz") // 改成合适的名字
+// 启用 Sedona 自定义 Kryo 序列化器
+.config("spark.serializer", classOf[KryoSerializer].getName) // org.apache.spark.serializer.KryoSerializer
+.config("spark.kryo.registrator", classOf[SedonaVizKryoRegistrator].getName) // org.apache.sedona.viz.core.Serde.SedonaVizKryoRegistrator
+.getOrCreate()
+```
+
+## 初始化 SedonaContext
+
+在创建 Sedona 配置之后加上以下代码。如果您已经有了由 Wherobots / AWS EMR / Databricks 创建的 SparkSession（通常名为 `spark`），请改为调用 `SedonaContext.create(spark)`。
+
+==Sedona >= 1.4.1==
+
+```scala
+val sedona = SedonaContext.create(config)
+SedonaVizRegistrator.registerAll(sedona)
+```
+
+==Sedona <1.4.1==
+
+下面这种方式自 Sedona 1.4.1 起已弃用，请改用上面的方式创建 SedonaContext。
+
+```scala
+SedonaSQLRegistrator.registerAll(sparkSession)
+SedonaVizRegistrator.registerAll(sparkSession)
+```
+
+也可以通过在 `spark-submit` 或 `spark-shell` 中传入 `--conf spark.sql.extensions=org.apache.sedona.viz.sql.SedonaVizExtensions,org.apache.sedona.sql.SedonaSqlExtensions` 来一并注册。
+
+## 创建空间 DataFrame
+
+假设有如下 DataFrame：
+
+```
++----------+---------+
+|       _c0|      _c1|
++----------+---------+
+|-88.331492|32.324142|
+|-88.175933|32.360763|
+|-88.388954|32.357073|
+|-88.221102| 32.35078|
+```
+
+首先需要构造一列几何类型的列：
+
+```sql
+CREATE OR REPLACE TEMP VIEW pointtable AS
+SELECT ST_Point(cast(pointtable._c0 as Decimal(24,20)),cast(pointtable._c1 as Decimal(24,20))) as shape
+FROM pointtable
+```
+
+Sedona 提供了多种方式加载各种空间数据格式，详见 [编写空间 DataFrame 应用](sql.md)。
+
+## 生成单张图像
+
+大多数情况下，您只想从空间数据中得到一张图像。
+
+### 像素化空间对象
+
+要在地图图像上显示空间对象，先要把它们转换为像素。
+
+首先计算该列的空间边界：
+
+```sql
+CREATE OR REPLACE TEMP VIEW boundtable AS
+SELECT ST_Envelope_Aggr(shape) as bound FROM pointtable
+```
+
+然后用 ST_Pixelize 将其转换为像素。
+
+下面这段代码适用于 Sedona v1.0.1 之前。`ST_Pixelize` 继承自 Generator，因此可以直接展开数组而无需 **explode** 函数。
+
+```sql
+CREATE OR REPLACE TEMP VIEW pixels AS
+SELECT pixel, shape FROM pointtable
+LATERAL VIEW ST_Pixelize(ST_Transform(shape, 'epsg:4326','epsg:3857'), 256, 256, (SELECT ST_Transform(bound, 'epsg:4326','epsg:3857') FROM boundtable)) AS pixel
+```
+
+下面这段代码适用于 Sedona v1.0.1 及之后。`ST_Pixelize` 返回像素数组，需要使用 **explode** 展开：
+
+```sql
+CREATE OR REPLACE TEMP VIEW pixels AS
+SELECT pixel, shape FROM pointtable
+LATERAL VIEW explode(ST_Pixelize(ST_Transform(shape, 'epsg:4326','epsg:3857'), 256, 256, (SELECT ST_Transform(bound, 'epsg:4326','epsg:3857') FROM boundtable))) AS pixel
+```
+
+执行完本教程末尾的 `ST_Render` 后，将得到 256*256 分辨率的图像。
+
+!!!warning
+	强烈建议先用 `ST_Transform` 将坐标转换到适合可视化的坐标系（如 epsg:3857），否则地图可能出现变形。
+
+### 聚合像素
+
+许多对象可能被像素化到同一个像素位置。需要按空间聚合或按温度、湿度等空间观测值进行聚合：
+
+```sql
+CREATE OR REPLACE TEMP VIEW pixelaggregates AS
+SELECT pixel, count(*) as weight
+FROM pixels
+GROUP BY pixel
+```
+
+`weight` 表示空间聚合或空间观测的程度，后续会决定该像素的颜色。
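+
+这一聚合在逻辑上等价于按像素计数。下面用纯 Python 做一个小规模示意；像素用 `(x, y)` 元组表示，这是演示用的假设，并非 SedonaViz 的内部表示：
+
+```python
+from collections import Counter
+
+# 四个对象像素化后的位置，其中两个落在同一像素 (10, 20)
+pixels = [(10, 20), (10, 20), (11, 20), (12, 25)]
+
+# 相当于 SELECT pixel, count(*) AS weight ... GROUP BY pixel
+weights = Counter(pixels)
+print(weights[(10, 20)])  # 2
+```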
+
+### 给像素上色
+
+运行以下命令为像素根据 weight 上色：
+
+```sql
+CREATE OR REPLACE TEMP VIEW pixelaggregates AS
+SELECT pixel, ST_Colorize(weight, (SELECT max(weight) FROM pixelaggregates)) as color
+FROM pixelaggregates
+```
+
+详细 API 说明请参阅 [ST_Colorize](../api/viz/sql.md#st_colorize)。
+
+### 渲染图像
+
+使用 `ST_Render` 把所有像素绘制到一张图像上：
+
+```sql
+CREATE OR REPLACE TEMP VIEW images AS
+SELECT ST_Render(pixel, color) AS image, (SELECT ST_AsText(bound) FROM boundtable) AS boundary
+FROM pixelaggregates
+```
+
+该 DataFrame 中将包含一列 Image 类型的列，且只有一张图像。
+
+### 将图像保存到磁盘
+
+从上面的 DataFrame 中取出图像：
+
+```scala
+var image = sedona.table("images").take(1)(0)(0).asInstanceOf[ImageSerializableWrapper].getImage
+```
+
+使用 Sedona Viz 的 `ImageGenerator` 把图像保存到磁盘：
+
+```scala
+var imageGenerator = new ImageGenerator
+imageGenerator.SaveRasterImageAsLocalFile(image, System.getProperty("user.dir")+"/target/points", ImageType.PNG)
+```
+
+## 生成地图瓦片
+
+如果您是地图相关从业者，可能需要为不同缩放级别生成地图瓦片，最终构建出地图瓦片图层。
+
+### 像素化与像素聚合
+
+请先用与单张图像生成相同的命令完成像素化与像素聚合。在 `ST_Pixelize` 中需要指定较高的分辨率，例如 1000*1000。注意：每个维度都应能被 2^zoom-level 整除。
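+
+整除条件可以在提交作业前先行检查。下面是一个最小示意，`check_resolution` 为示例函数名，并非 SedonaViz 的 API：
+
+```python
+def check_resolution(width, height, zoom_level):
+    """检查分辨率的每个维度能否被 2**zoom_level 整除。"""
+    tiles_per_axis = 2**zoom_level
+    return width % tiles_per_axis == 0 and height % tiles_per_axis == 0
+
+
+print(check_resolution(1000, 1000, 3))  # True：1000 / 8 = 125
+print(check_resolution(1000, 1000, 4))  # False：1000 / 16 = 62.5
+```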
+
+### 计算 tile name
+
+使用以下命令为每个像素计算 tile name：
+
+```sql
+CREATE OR REPLACE TEMP VIEW pixelaggregates AS
+SELECT pixel, weight, ST_TileName(pixel, 3) AS pid
+FROM pixelaggregates
+```
+
+其中 `3` 表示这些地图瓦片的缩放级别。
+
+### 像素上色
+
+使用与单张图像生成相同的命令进行上色。
+
+### 渲染地图瓦片
+
+把像素按 tile 分组，然后并行渲染各个瓦片图像：
+
+```sql
+CREATE OR REPLACE TEMP VIEW images AS
+SELECT ST_Render(pixel, color, 3) AS image
+FROM pixelaggregates
+GROUP BY pid
+```
+
+`3` 是这些地图瓦片的缩放级别。
+
+### 将地图瓦片保存到磁盘
+
+可以沿用单张图像生成中的命令，将所有地图瓦片逐一取出并保存。
diff --git a/docs/tutorial/zeppelin.zh.md b/docs/tutorial/zeppelin.zh.md
new file mode 100644
index 00000000000..8e8f70b3b70
--- /dev/null
+++ b/docs/tutorial/zeppelin.zh.md
@@ -0,0 +1,85 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
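+Sedona provides a Helium visualization plugin tailored for [Apache Zeppelin](https://zeppelin.apache.org/), which bridges the gap between Sedona and Zeppelin. For installation instructions, see [Install Sedona-Zeppelin](../setup/zeppelin.md).
+
+Sedona-Zeppelin offers two ways to visualize spatial data in Zeppelin. The first uses Zeppelin to plot all the spatial objects on a map; the second uses SedonaViz to generate map images and overlay them on a map.
+
+## Small-scale data: without SedonaViz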
+
+!!! danger
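+	Zeppelin is only a front-end visualization framework. This approach does not scale and will fail on large-scale geospatial data. Read on for the SedonaViz solution.
+
+Apache Zeppelin can plot a small number of spatial objects, such as 1000 points. Assuming you already have a spatial DataFrame, convert its geometry column to a WKT string column in a Scala paragraph of a Zeppelin Spark notebook: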
+
+```scala
+spark.sql(
+  """
+    |CREATE OR REPLACE TEMP VIEW wktpoint AS
+    |SELECT ST_AsText(shape) as geom
+    |FROM pointtable
+  """.stripMargin)
+```
+
+Then create a SQL paragraph to fetch the data:
+
+```sql
+%sql
+SELECT *
+FROM wktpoint
+```
+
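+Select the geometry column to visualize:
+
+![Create a SQL paragraph and select the geometry column](../image/sql-zeppelin.gif)
+
+## Large-scale data: with SedonaViz
+
+SedonaViz is a distributed visualization system that can visualize spatial data at scale. See [How to use SedonaViz](viz.md).
+
+With Sedona-Zeppelin, Zeppelin can overlay SedonaViz images on a map, which makes it easy to visualize 1 billion or more spatial objects (depending on cluster size).
+
+First, in a Scala paragraph of a Zeppelin Spark notebook, encode the image in the SedonaViz DataFrame: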
+
+```scala
+spark.sql(
+  """
+    |CREATE OR REPLACE TEMP VIEW images AS
+    |SELECT ST_EncodeImage(image) AS image, (SELECT ST_AsText(bound) FROM boundtable) AS boundary
+    |FROM images
+  """.stripMargin)
+```
+
+Then create a SQL paragraph to fetch the data:
+
+```sql
+%sql
+SELECT *, 'I am the map center!'
+FROM images
+```
+
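+Select the image and its geospatial boundary:
+
+![Select the image and its boundary](../image/viz-zeppelin.gif)
+
+## Zeppelin Spark notebook demo
+
+We provide a complete Zeppelin Spark notebook that demonstrates all of these functions. Download the [Sedona-Zeppelin notebook template](../image/geospark-zeppelin-demo.json) and the [test data - arealm.csv](../image/arealm.csv).
+
+Import the notebook JSON file into Zeppelin and change the input data path in the notebook.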