Apache Spark 2.2.0 is the third release on the 2.x line. This release removes the experimental tag from Structured Streaming. It also focuses on usability, stability, and polish, resolving over 1100 tickets. Additionally, we are excited to announce that PySpark is now available on PyPI.

To download Apache Spark 2.2.0, visit the downloads page. You can consult JIRA for the detailed changes. We have curated a list of high-level changes here, grouped by major modules.

- SPARK-20844: The Structured Streaming APIs are now GA and are no longer labeled experimental.

Programming guides: Spark RDD Programming Guide and Spark SQL, DataFrames and Datasets Guide.

- SPARK-19464: Remove support for Hadoop 2.5 and earlier.
- SPARK-8425: Add blacklist mechanism for task scheduling.
- SPARK-18191: Port RDD API to use commit protocol.
- SPARK-13446: Support reading data from Hive metastore 2.0/2.1.
- SPARK-18209: More robust view canonicalization without full SQL expansion.
- SPARK-18703: Drop Staging Directories and Data Files after completion of Insertion/CTAS against Hive-serde Tables.
- SPARK-21079: Analyze Table Command on partitioned tables.
- SPARK-19610: Support for parsing multi-line CSV files.
- SPARK-18352: Support for parsing multi-line JSON files.
- SPARK-15352: Topology aware block replication.
- SPARK-18761: Uncancellable / unkillable tasks shouldn’t starve jobs of resources.
- SPARK-18775: Limit the max number of records written per file.
- SPARK-18362, SPARK-19918: File listing/IO improvements for CSV and JSON.
- SPARK-18186: Partial aggregation support of HiveUDAFFunction.
- SPARK-17949: Introduce a JVM object based aggregate operator.
- SPARK-17626: TPC-DS performance improvements using star-schema heuristics.
- SPARK-17080: Cost-based join re-ordering.
- SPARK-17075, SPARK-17076, SPARK-19020, SPARK-17077, SPARK-19350: Cardinality estimation for filter, join, aggregate, project and limit/sample operators.
- SPARK-19139: AES-based authentication mechanism for Spark.
- SPARK-17203: Data source options should always be case insensitive.
- SPARK-20576: Support generic hint function in Dataset/DataFrame.
- SPARK-18127: Add hooks and extension points to Spark.
- SPARK-20420: Add events to the external catalog.
- SPARK-19261: Support ALTER TABLE table_name ADD COLUMNS.
- SPARK-18350: Support session local timezone.
- SPARK-16475: Added Broadcast Hints BROADCAST, BROADCASTJOIN, and MAPJOIN, for SQL Queries.
- SPARK-18885: Unify CREATE TABLE syntax for data source and Hive serde tables.
- SPARK-13721: Add support for LATERAL VIEW OUTER explode().
- SPARK-19107: Support creating Hive table with DataFrameWriter and Catalog.
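To illustrate why multi-line JSON support (SPARK-18352) matters, here is a minimal sketch in plain Python. It is not Spark's implementation. It only shows that a reader which treats each line as a separate JSON document cannot handle a record that is pretty-printed across several lines, while a reader that parses the whole text can. The sample record and both function names are made up for this illustration.

```python
import json

# Hypothetical sample: one JSON record pretty-printed across several lines.
PRETTY_RECORD = """{
  "id": 1,
  "name": "spark"
}"""


def parse_line_by_line(text):
    """Treat every line as its own JSON document (JSON Lines style)."""
    records, bad_lines = [], []
    for line in text.splitlines():
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            # A fragment such as '{' or '"id": 1,' is not valid JSON by itself.
            bad_lines.append(line)
    return records, bad_lines


def parse_whole_document(text):
    """Treat the full text as a single JSON document."""
    return json.loads(text)


if __name__ == "__main__":
    records, bad_lines = parse_line_by_line(PRETTY_RECORD)
    print(len(records), "records,", len(bad_lines), "unparseable lines")
    print(parse_whole_document(PRETTY_RECORD))
```

In PySpark the equivalent switch is, as far as I know, the `multiLine` reader option (for example `spark.read.json(path, multiLine=True)`). Check the `DataFrameReader` documentation for your Spark version before relying on that name.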