Consider a seemingly very simple SQL query, shown below, that does nothing more than compute total monthly sales:
SELECT OrderMonth, SUM(OrderAmount) AS Amount
FROM FctOrderSales WITH(NOLOCK)
WHERE OrderMonth BETWEEN '2017-01-01' AND '2018-12-31'
GROUP BY OrderMonth
Once the amount of data in FctOrderSales grows, I am afraid the result may not come back even after half an hour of waiting.
On top of that, our users will not pass up the opportunity to "torture" us IT engineers so easily: statistics by month are never enough, so they add product, region, and other dimensions. That makes things even more troublesome. You sit with your eyes glued to the screen, watching the minutes and seconds tick by, and your youth slipping away with them~
In fact, you certainly already know some ways to deal with this kind of inefficient query, such as:
1. Add an index
2. Add partitioning
3. Pre-compute the aggregated data in ETL
4. ...
There is always a solution. Here we take a look at another approach: columnar storage.
In the figure above, the table data is laid out as a typical row-based data page: adjacent rows are stored together on a data page, and within each row the columns are stored one after another. The Columnar Storage Layout is columnar storage: the data of each column is stored in its own data file. For example, the date_key values are stored in order in the date_key file, and the product_sk values are stored in the product_sk file, in the same row order as the two-dimensional table. In each column file, the value stored at a given row number is that column's value for the same row number in the table. So to obtain row 20 of the original table, you fetch the 20th entry from each of the column files and assemble them together.
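The row-to-column mapping described above can be sketched in a few lines of Python. This is only an illustration: the three sample rows and the helper names to_columns and get_row are made up here, and real column stores keep each column in an on-disk file rather than an in-memory list.

```python
# Hypothetical three-row fact table; the values are made up for illustration.
rows = [
    {"date_key": 20130102, "product_sk": 69, "quantity": 1},
    {"date_key": 20130102, "product_sk": 31, "quantity": 3},
    {"date_key": 20130103, "product_sk": 69, "quantity": 2},
]

def to_columns(rows):
    """Turn a row-based layout into one list ("file") per column, keeping row order."""
    return {name: [row[name] for row in rows] for name in rows[0]}

def get_row(columns, n):
    """Rebuild row n by taking the n-th entry of every column file."""
    return {name: values[n] for name, values in columns.items()}

columns = to_columns(rows)
print(columns["product_sk"])   # [69, 31, 69]
print(get_row(columns, 1))     # {'date_key': 20130102, 'product_sk': 31, 'quantity': 3}
```

Because every column list keeps the same row order, the position alone is enough to stitch a row back together; no per-value row identifier needs to be stored.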
Consider the following scenario: we want to analyze the 2013 sales, by month, of two products, 69 and 31:
SELECT
    getMonth(date_Key) AS Month,
    getProductName(product_sk) AS Product,
    SUM(quantity) AS Quantity
FROM FctSalesOrdinary
GROUP BY getMonth(date_Key), getProductName(product_sk)
Here, let's make these assumptions:
1. date_key is stored in one data file, and product_sk is stored in another. There are 2 million sales rows for the whole of 2013, and each file segment can hold 1 million rows (based on the segment capacity of SQL Server), so 2 segments are fetched in total. Data is read one segment at a time, and two consecutive segments are read in a single sequential pass, so the disk head does not need to seek again.
2. Assume that the first row of 2013 is the 1 millionth row in the source table, and the last row of 2013 is the 2.99 millionth row.
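The segment arithmetic behind these two assumptions can be checked with a small sketch. The function name segments_for_range is hypothetical, and the sketch assumes that row positions are counted from 0, so that position 1,000,000 is the first row of the second segment; with a different numbering convention the boundaries would shift.

```python
SEGMENT_ROWS = 1_000_000  # rows per segment, as assumed in the text

def segments_for_range(first_row, last_row, segment_rows=SEGMENT_ROWS):
    """Return the segment numbers covering row positions first_row..last_row.

    Assumption: row positions are counted from 0, so position 1,000,000
    is the first row of segment 1.
    """
    first_segment = first_row // segment_rows
    last_segment = last_row // segment_rows
    return list(range(first_segment, last_segment + 1))

segments = segments_for_range(1_000_000, 2_990_000)
print(segments)        # [1, 2]
print(len(segments))   # 2
```

The two segments are adjacent, which is why the text treats them as one sequential read per column file.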
According to the diagram above, when we read the 2013 data we read only the fields we need, such as product_sk: that is, rows 1 million through 2.99 million of that column, which are then filtered by product_sk. Compared with row-based data pages, the other fields such as store, promotion, and customer are skipped entirely, so far less irrelevant data is read.
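As a rough sketch of this access pattern, the aggregation can be written so that it touches only the columns the query needs. Everything here is illustrative: the sample values are made up, the function name monthly_quantity is hypothetical, and date_key is assumed to be an integer in YYYYMMDD form, which the article does not state.

```python
from collections import defaultdict

# Hypothetical column files; every list has one entry per table row.
columns = {
    "date_key":   [20121231, 20130105, 20130120, 20130214, 20130215, 20140101],
    "product_sk": [69,       69,       31,       69,       30,       31],
    "store":      [1,        2,        2,        3,        1,        2],
    "quantity":   [5,        1,        3,        2,        9,        4],
}

def monthly_quantity(columns, year, products):
    """Sum quantity per (month, product) using only the three needed columns."""
    totals = defaultdict(int)
    needed = zip(columns["date_key"], columns["product_sk"], columns["quantity"])
    for date_key, product_sk, quantity in needed:
        # Assumption: date_key is an integer of the form YYYYMMDD.
        if date_key // 10000 == year and product_sk in products:
            month = date_key // 100 % 100
            totals[(month, product_sk)] += quantity
    return dict(totals)

result = monthly_quantity(columns, 2013, {69, 31})
print(result)  # {(1, 69): 1, (1, 31): 3, (2, 69): 2}
```

The store column is never read. In a row-based layout, its values would be pulled in along with every row on the page.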
Columnar storage also comes with a compression option. Because each column file holds homogeneous data, it compresses very efficiently. Compression reduces the amount of data that has to be read from disk, lets more data fit in memory, and makes effective use of the CPU L1 cache possible: a chunk of compressed column data that fits in the cache can be processed in a tight loop. This technique is called vectorized processing.
*Reference: The Design and Implementation of Modern Column-Oriented Database Systems.
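To see why homogeneous column data compresses well, here is a minimal sketch of run-length encoding. It is one common column compression scheme, chosen here only for illustration; the article does not say which scheme any particular product uses. The sample values and the names rle_encode and rle_decode are made up.

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encode a column: consecutive equal values become (value, count)."""
    return [(value, len(list(group))) for value, group in groupby(values)]

def rle_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]

product_sk = [31, 31, 31, 31, 69, 69, 69, 31, 31]
encoded = rle_encode(product_sk)
print(encoded)  # [(31, 4), (69, 3), (31, 2)]
```

Nine values shrink to three pairs. The longer the runs of repeated values in a column, the better this works, which is why sorted or low-cardinality columns benefit most.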
Why columnar storage is more suitable for analytical data warehouses:
1. Analytical workloads read large, contiguous ranges of values for an attribute. That is sequential reading rather than random reading, which is much faster.
2. A query typically reads several dimensions at once without reading all the columns. With row-based data pages, a large amount of data from unneeded adjacent columns would be read as well.
3. Columnar storage mechanism: a data file stores the data of an entire column on its own, divided into segments, and at least one segment is read at a time. A segment can hold a large amount of homogeneous data.
Databases and storage systems that currently support columnar storage include:
Greenplum
PostgreSQL
MariaDB
Microsoft Azure SQL Data Warehouse
Microsoft SQL Server 2012 and above
BIRT Analytics ColumnarDB
IBM Db2
Oracle Database/Exadata
SAP HANA
Teradata
Apache HBase
ClickHouse
Apache Parquet
The above are commonly used database brands; there are also some niche databases, such as MonetDB and kdb+. So start using columnar storage early.