================================================================================================
Benchmark to measure CSV read/write performance
================================================================================================

OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parsing quoted values:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
One quoted string                                 24317          24376          95          0.0      486343.1       1.0X

OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Wide rows with 1000 columns:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 1000 columns                               56420          56835         704          0.0       56420.3       1.0X
Select 100 columns                                20565          20673         113          0.0       20564.7       2.7X
Select one column                                 17105          17145          38          0.1       17105.4       3.3X
count()                                            3378           3428          68          0.3        3378.0      16.7X
Select 100 columns, one bad input field           24702          24731          37          0.0       24702.1       2.3X
Select 100 columns, corrupt record field          28027          28093          91          0.0       28026.7       2.0X

OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Count a dataset with 10 columns:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns + count()                       10764          10804          35          0.9        1076.4       1.0X
Select 1 column + count()                          7422           7424           1          1.3         742.2       1.5X
count()                                            1679           1682           3          6.0         167.9       6.4X

OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps                      829            834           7         12.1          82.9       1.0X
to_csv(timestamp)                                  5601           5649          49          1.8         560.1       0.1X
write timestamps to files                          5733           5743          11          1.7         573.3       0.1X
Create a dataset of dates                           923            931           8         10.8          92.3       0.9X
to_csv(date)                                       4069           4071           4          2.5         406.9       0.2X
write dates to files                               4030           4035           6          2.5         403.0       0.2X

OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Read dates and timestamps:                                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------
read timestamp text from files                                                  1157           1161           4          8.6         115.7       1.0X
read timestamps from files                                                     11666          11677          12          0.9        1166.6       0.1X
infer timestamps from files                                                    23313          23345          47          0.4        2331.3       0.0X
read date text from files                                                       1061           1072          10          9.4         106.1       1.1X
read date from files                                                           10393          10406          11          1.0        1039.3       0.1X
infer date from files                                                          20923          20949          27          0.5        2092.3       0.1X
timestamp strings                                                               1215           1220           5          8.2         121.5       1.0X
parse timestamps from Dataset[String]                                          13441          13464          22          0.7        1344.1       0.1X
infer timestamps from Dataset[String]                                          24868          24942          91          0.4        2486.8       0.0X
date strings                                                                    1681           1682           1          5.9         168.1       0.7X
parse dates from Dataset[String]                                               12086          12095           8          0.8        1208.6       0.1X
from_csv(timestamp)                                                            11219          11323          92          0.9        1121.9       0.1X
from_csv(date)                                                                 10647          10658          10          0.9        1064.7       0.1X
infer error timestamps from Dataset[String] with default format                14771          14788          17          0.7        1477.1       0.1X
infer error timestamps from Dataset[String] with user-provided format          14792          14816          23          0.7        1479.2       0.1X
infer error timestamps from Dataset[String] with legacy format                 14780          14818          33          0.7        1478.0       0.1X

OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Filters pushdown:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
w/o filters                                        4307           4312           4          0.0       43066.8       1.0X
pushdown disabled                                  4358           4388          26          0.0       43575.3       1.0X
w/ filters                                          727            734           8          0.1        7267.3       5.9X

OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Interval:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Read as Intervals                                   761            764           2          0.4        2537.6       1.0X
Read Raw Strings                                    336            337           1          0.9        1120.9       2.3X
