Temporal Table Applications in SQL Data Warehouse Environments

2023-10-18

Today's subject of investigation is the temporal table, a new feature in SQL Server 2016. I lean slightly towards how to use it in Data Warehouse environments, but some general information comes up along the way.

I want to cover the following topics:

  1. What is a temporal table (in short)?
  2. Can I use a temporal table for a table in the PSA (Persistent Staging Area)?
  3. Can I use a temporal table for a Data Vault Satellite?
  4. Is using temporal tables for full auditing in an OLTP system a good idea?

What is a temporal table (in short)?

In short, a temporal table is a table whose changes are tracked in a second “History” table. SQL Server 2016 manages this process itself, so you do not have to write extra code for it.

The free eBook Introducing Microsoft SQL Server 2016 states the following:

“When you create a temporal table, you must include a primary key and non-nullable period columns, a pair of columns having a datetime2 data type that you use as the start and end periods for which a row is valid.”

For more details please read the book.

You can also read more about temporal tables on MSDN.

An example of how a temporal table looks in SQL Server Management Studio. You can see by the icon and the suffix (System Versioned) (both marked red) that this is a temporal table. Underneath the table node the history table is shown (marked green). Start and end datetime columns are required, but you can give them a custom name (marked blue).

There are three ways to create the history table for a temporal table:

  1. Create a temporal table with an anonymous history table: you don’t really bother about the history table, SQL Server gives it a name and creates it.
  2. Create a temporal table with a default history table: same as anonymous, but you provide the name for the history table to use.
  3. Create a temporal table with an existing history table: you create the history table first, and can optimize storage and/or indexes. Then you create the temporal table and in the CREATE statement provide the name of the existing history table.

So the first two are the “lazy” options, and they might be good enough for smaller tables. The third option allows you to fully tweak the history table.
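
To make the difference between the two “lazy” options concrete, here is a minimal sketch. The table and column names ([dbo].[DemoAnonymous], [dbo].[DemoDefault], [Payload], [ValidFrom], [ValidTo]) are hypothetical and only serve as illustration; the third option is the one used in the script for the Persistent Staging Area.

```sql
-- Option 1: anonymous history table.
-- SQL Server creates the history table and picks a name for it.
CREATE TABLE [dbo].[DemoAnonymous]   -- hypothetical table name
(
    [DemoID]    INT NOT NULL CONSTRAINT [PK_DemoAnonymous] PRIMARY KEY CLUSTERED,
    [Payload]   NVARCHAR(50) NULL,
    [ValidFrom] DATETIME2(2) GENERATED ALWAYS AS ROW START NOT NULL,
    [ValidTo]   DATETIME2(2) GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME ([ValidFrom], [ValidTo])
)
WITH (SYSTEM_VERSIONING = ON);
GO

-- Option 2: default history table.
-- SQL Server still creates the history table, but under the name you provide.
CREATE TABLE [dbo].[DemoDefault]     -- hypothetical table name
(
    [DemoID]    INT NOT NULL CONSTRAINT [PK_DemoDefault] PRIMARY KEY CLUSTERED,
    [Payload]   NVARCHAR(50) NULL,
    [ValidFrom] DATETIME2(2) GENERATED ALWAYS AS ROW START NOT NULL,
    [ValidTo]   DATETIME2(2) GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME ([ValidFrom], [ValidTo])
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[DemoDefault_History]));
GO
```

The only difference between the two statements is the HISTORY_TABLE argument, which must be given as a schema-qualified name.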

I have used the third option in my Persistent Staging Area, see below.

Can I use a temporal table for a table in the PSA (Persistent Staging Area)?

In my previous blog post, Using a Persistent Staging Area: What, Why, and How, you could read what a Persistent Staging Area (or PSA for short) is.

Today I want to share my experiences from my lab tests using temporal tables in the PSA.

Besides a temporal table, I have also created a “normal” staging table for loading the data. This is because:

  1. A temporal table cannot be truncated, and because truncate is much faster than delete, I create a normal staging table to load the data from the source.
  2. I want to load the source data as fast as possible, so I prefer a plain insert over doing change detection against the rows currently in the temporal table. Change detection would be slower, and I prefer to do it later, in parallel with loading the rest of the EDW.
  3. I want the PSA to stay optional and not become a core part of the EDW. If the PSA is additional to a normal Staging Area, it is easier to switch off later.

Here is the script I used to create the temporal table:

--\
---) Create the history table ourselves, to be used as a backing table
---) for a Temporal table, so we can tweak it for optimal performance.
---) Please note that I use the datatype DATETIME2(2) because it
---) uses 6 bytes storage, whereas DATETIME2(7) uses 8 bytes.
---) If the centiseconds precision of DATETIME2(2) is not enough
---) in your data warehouse, you can change it to DATETIME2(7).
--/
CREATE TABLE [psa].[Customer_TemporalHistory]
(
    [CustomerID] INT NOT NULL,
    [FirstName] NVARCHAR(20) NULL,
    [Initials] NVARCHAR(20) NULL,
    [MiddleName] NVARCHAR(20) NULL,
    [SurName] NVARCHAR(50) NOT NULL,
    [DateOfBirth] DATE NOT NULL,
    [Gender] CHAR(1) NOT NULL,
    [SocialSecurityNumber] CHAR(12) NOT NULL,
    [Address] NVARCHAR(60) NOT NULL,
    [PostalCode] CHAR(10) NULL,
    [Residence] NVARCHAR(60) NULL,
    [StateOrProvince] NVARCHAR(20) NULL,
    [Country] NVARCHAR(60) NULL,
    [RowHash] BINARY(16),
    [SessionStartDts] DATETIME2(2) NOT NULL,
    [EffectiveStartDts] DATETIME2(2) NOT NULL,
    [EffectiveEndDts] DATETIME2(2) NOT NULL
);
GO

--\
---) Add indexes to history table
--/
CREATE CLUSTERED COLUMNSTORE INDEX [IXCS_Customer_TemporalHistory]
    ON [psa].[Customer_TemporalHistory];

CREATE NONCLUSTERED INDEX [IXNC_Customer_TemporalHistory__EffectiveEndDts_EffectiveStartDts_CustomerID]
    ON [psa].[Customer_TemporalHistory] ([EffectiveEndDts], [EffectiveStartDts], [CustomerID]);
GO

--\
---) Now create the temporal table
--/
CREATE TABLE [psa].[Customer_Temporal]
(
    [CustomerID] INT NOT NULL,
    [FirstName] NVARCHAR(20) NULL,
    [Initials] NVARCHAR(20) NULL,
    [MiddleName] NVARCHAR(20) NULL,
    [SurName] NVARCHAR(50) NOT NULL,
    [DateOfBirth] DATE NOT NULL,
    [Gender] CHAR(1) NOT NULL,
    [SocialSecurityNumber] CHAR(12) NOT NULL,
    [Address] NVARCHAR(60) NOT NULL,
    [PostalCode] CHAR(10) NULL,
    [Residence] NVARCHAR(60) NULL,
    [StateOrProvince] NVARCHAR(20) NULL,
    [Country] NVARCHAR(60) NULL,
    [RowHash] BINARY(16),
    [SessionStartDts] DATETIME2(2) NOT NULL,
    -- SessionStartDts is manually set, and is the same for all
    -- rows of the same session/loadcycle.
    CONSTRAINT [PK_Customer_Temporal] PRIMARY KEY CLUSTERED ([CustomerID] ASC),
    [EffectiveStartDts] DATETIME2(2) GENERATED ALWAYS AS ROW START NOT NULL,
    [EffectiveEndDts] DATETIME2(2) GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME ([EffectiveStartDts], [EffectiveEndDts])
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = [psa].[Customer_TemporalHistory]));
GO

-- Add a few indexes.
CREATE NONCLUSTERED INDEX [IXNC_Customer_Temporal__EffectiveEndDts_EffectiveStartDts_CustomerID]
    ON [psa].[Customer_Temporal] ([EffectiveEndDts], [EffectiveStartDts], [CustomerID]);

CREATE NONCLUSTERED INDEX [IXNC_Customer_Temporal__CustomerID_RowHash]
    ON [psa].[Customer_Temporal] ([CustomerID], [RowHash]);
GO

Note: I have not included all the scripts that I used for my test in this article, because it could be overwhelming. But if you are interested you can download all the scripts and the two SSIS test packages here.

Maybe you are as curious as I am to know whether using temporal tables for a PSA is a good idea.

The considerations are the following:

  • Speed of data loading.
  • Speed of reading rows at a certain moment in time (time travel mechanism).
  • Ability to adopt changes in the datamodel.
  • Simplicity and reliability of the solution.
  • Ability to do historic loads, for instance from archived source files or an older data warehouse.

And of course we need something to compare with. Let that be a plain SQL Server table with a start and end datetime.

Before I present the test results, I want to share a few details about the test:

  • For testing I use a “Customer” table that is filled with half a million rows of dummy data.
  • I simulate 50 daily loads with deletes, inserts and updates in the staging table. After those 50 runs, the total number of rows has more than quadrupled to just over 2.2 million (2237460 to be exact).
  • For the DATETIME2 columns, I use a precision of centiseconds, so DATETIME2(2). For justification see one of my older blog posts: Stop being so precise! and more about using Load(end)dates (Datavault Series). If needed you can use a higher precision for your data warehouse.
  • I use a [RowHash] column, which is an MD5 hash value of all columns of a row that are relevant for a change (so start and end date are not used for the hash value). This is done primarily to get better performance when comparing new and existing rows.
  • I have compared all data in the non-temporal PSA table with the data in the temporal table and its backing history table to check that the rows were exactly the same, and this was the case (except for start and end dates).
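
As an illustration of the [RowHash] idea, the sketch below computes an MD5 hash over the change-relevant columns of the staging table. It is not the script from the downloadable materials: the choice of columns, the '|' delimiter and the treatment of NULL as an empty string (which is what CONCAT does) are assumptions made for this example.

```sql
-- Sketch: MD5 row hash over the columns that are relevant for change detection.
-- Assumptions: all non-key attributes take part in the hash, '|' separates the
-- values, and NULL is hashed as an empty string (CONCAT behaviour).
SELECT
    src.[CustomerID],
    CONVERT(BINARY(16), HASHBYTES('MD5', CONCAT(
        src.[FirstName], N'|',
        src.[Initials], N'|',
        src.[MiddleName], N'|',
        src.[SurName], N'|',
        src.[DateOfBirth], N'|',
        src.[Gender], N'|',
        src.[SocialSecurityNumber], N'|',
        src.[Address], N'|',
        src.[PostalCode], N'|',
        src.[Residence], N'|',
        src.[StateOrProvince], N'|',
        src.[Country]
    ))) AS [RowHash]
FROM [stg_internal].[Customer] AS src;
```

An MD5 hash is 16 bytes, which is why the [RowHash] column can be declared as BINARY(16). Comparing one 16-byte value per row is cheaper than comparing a dozen columns one by one.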

Speed of data loading

Using T-SQL for synchronizing data from a staging table to a PSA table, I got the following test results:

Testcase                      Average duration (50 loads, in ms)
Synchronize PSA temporal      6159
Synchronize PSA Non-temporal  24590

So we have a winner here: it’s the temporal table! It’s four times faster!
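
To give an impression of what such a synchronization involves for the temporal table, here is a minimal sketch. It is not the script that was measured (that one is in the downloadable materials). It assumes that the staging table holds a full snapshot of the source, that it already carries a populated [RowHash] column, and that one [SessionStartDts] value is used per load cycle.

```sql
-- Sketch only: synchronize the staging table to the temporal PSA table.
DECLARE @SessionStartDts DATETIME2(2) = SYSUTCDATETIME();  -- assumption: one value per load cycle

BEGIN TRANSACTION;

-- 1. Changed rows: SQL Server moves the old version to the history table.
UPDATE tgt
SET    [FirstName] = src.[FirstName],
       [Initials] = src.[Initials],
       [MiddleName] = src.[MiddleName],
       [SurName] = src.[SurName],
       [DateOfBirth] = src.[DateOfBirth],
       [Gender] = src.[Gender],
       [SocialSecurityNumber] = src.[SocialSecurityNumber],
       [Address] = src.[Address],
       [PostalCode] = src.[PostalCode],
       [Residence] = src.[Residence],
       [StateOrProvince] = src.[StateOrProvince],
       [Country] = src.[Country],
       [RowHash] = src.[RowHash],
       [SessionStartDts] = @SessionStartDts
FROM   [psa].[Customer_Temporal] AS tgt
JOIN   [stg_internal].[Customer] AS src
       ON src.[CustomerID] = tgt.[CustomerID]
WHERE  tgt.[RowHash] <> src.[RowHash];   -- assumes [RowHash] is never NULL

-- 2. New rows: the period columns are filled in by SQL Server.
INSERT INTO [psa].[Customer_Temporal]
       ([CustomerID], [FirstName], [Initials], [MiddleName], [SurName], [DateOfBirth],
        [Gender], [SocialSecurityNumber], [Address], [PostalCode], [Residence],
        [StateOrProvince], [Country], [RowHash], [SessionStartDts])
SELECT src.[CustomerID], src.[FirstName], src.[Initials], src.[MiddleName], src.[SurName], src.[DateOfBirth],
       src.[Gender], src.[SocialSecurityNumber], src.[Address], src.[PostalCode], src.[Residence],
       src.[StateOrProvince], src.[Country], src.[RowHash], @SessionStartDts
FROM   [stg_internal].[Customer] AS src
WHERE  NOT EXISTS (SELECT 1
                   FROM   [psa].[Customer_Temporal] AS tgt
                   WHERE  tgt.[CustomerID] = src.[CustomerID]);

-- 3. Rows that disappeared from the source: the deleted version goes to the history table.
DELETE tgt
FROM   [psa].[Customer_Temporal] AS tgt
WHERE  NOT EXISTS (SELECT 1
                   FROM   [stg_internal].[Customer] AS src
                   WHERE  src.[CustomerID] = tgt.[CustomerID]);

COMMIT TRANSACTION;
```

Note that there is no enddating logic anywhere in this sketch; SQL Server closes the old row versions itself. Running the three statements in one transaction also means that all changes of a load get the same system start time, because SQL Server uses the start time of the transaction for the period columns.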

Speed of reading rows at a certain moment in time (time travel mechanism)

For reading, I used two views and two SSIS Packages with a time travel mechanism and a data flow task.

The views return the rows valid at a certain point in time, selected from the temporal and non-temporal history table, respectively.

The data flow tasks in the SSIS packages have a conditional split that prevents the rows from actually being inserted into the OLE DB Destination. In this way it is a purer read test.

Here are the views that were used:

--\
---) For the demo, the virtualization layer is slightly different from a real life scenario.
---) Normally you would create separate databases for the Staging Area and the PSA, so you could do
---) the connection swap as explained here:
---) http://www.hansmichiels.com/2017/02/18/using-a-persistent-staging-area-what-why-and-how/
---) Normally you would also either have a normal history table or a temporal one, but not both.
---) As I now have three objects that would like to have the virtual name [stg].[Customer], I use a
---) suffix for the PSA versions, as this is workable for the demo.
---) So:
---) [stg].[Customer]: view on the normal [stg_internal].[Customer] table (only in downloadable materials).
---) [stg].[Customer_H]: view on the [psa].[Customer_History] table.
---) [stg].[Customer_TH]: view on the [psa].[Customer_Temporal] table.
--/

-------------- [Customer_H] --------------
IF OBJECT_ID('[stg].[Customer_H]', 'V') IS NOT NULL DROP VIEW [stg].[Customer_H];
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
CREATE VIEW [stg].[Customer_H]
AS
/*
==========================================================================================
Author:      Hans Michiels
Create date: 15-FEB-2017
Description: Virtualization view in order to be able to do full reloads from the PSA.
==========================================================================================
*/
SELECT
    hist.[CustomerID],
    hist.[FirstName],
    hist.[Initials],
    hist.[MiddleName],
    hist.[SurName],
    hist.[DateOfBirth],
    hist.[Gender],
    hist.[SocialSecurityNumber],
    hist.[Address],
    hist.[PostalCode],
    hist.[Residence],
    hist.[StateOrProvince],
    hist.[Country],
    hist.[RowHash],
    hist.[SessionStartDts]
FROM
    [psa].[PointInTime] AS pit
JOIN
    [psa].[Customer_History] AS hist
    ON  hist.EffectiveStartDts <= pit.CurrentPointInTime
    AND hist.EffectiveEndDts > pit.CurrentPointInTime
GO

-------------- [Customer_TH] --------------
IF OBJECT_ID('[stg].[Customer_TH]', 'V') IS NOT NULL DROP VIEW [stg].[Customer_TH];
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
CREATE VIEW [stg].[Customer_TH]
AS
/*
==========================================================================================
Author:      Hans Michiels
Create date: 15-FEB-2017
Description: Virtualization view in order to be able to do full reloads from the PSA.
==========================================================================================
*/
SELECT
    hist.[CustomerID],
    hist.[FirstName],
    hist.[Initials],
    hist.[MiddleName],
    hist.[SurName],
    hist.[DateOfBirth],
    hist.[Gender],
    hist.[SocialSecurityNumber],
    hist.[Address],
    hist.[PostalCode],
    hist.[Residence],
    hist.[StateOrProvince],
    hist.[Country],
    hist.[RowHash],
    hist.[SessionStartDts]
FROM
    [psa].[PointInTime] AS pit
JOIN
    [psa].[Customer_Temporal] FOR SYSTEM_TIME ALL AS hist
    -- "FOR SYSTEM_TIME AS OF" does only work with a constant value or variable,
    -- not by using a column from a joined table, e.g. pit.[CurrentPointInTime]
    -- So unfortunately we have to select all rows, and then do the date logic ourselves
    --
    -- Under the hood, a temporal table uses EXCLUSIVE enddating:
    -- the enddate of a row is equal to the startdate of the row that replaces it.
    -- Therefore we can not use BETWEEN, as this includes the enddatetime.
    ON  hist.EffectiveStartDts <= pit.CurrentPointInTime
    AND hist.EffectiveEndDts > pit.CurrentPointInTime
    -- By the way, there are more ways to do this, you could also use a CROSS JOIN and
    -- a WHERE clause here, instead of doing the datetime filtering in the join.
GO
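
For comparison, this is roughly what the built-in time travel syntax looks like when the point in time is available as a variable, for example in an ad hoc query or a stored procedure. It is a sketch that assumes [psa].[PointInTime] contains exactly one row; the selected columns are just a sample.

```sql
-- Sketch: time travel with FOR SYSTEM_TIME AS OF and a variable.
-- Assumption: [psa].[PointInTime] holds a single row with the point in time (in UTC).
DECLARE @PointInTime DATETIME2(2) =
    (SELECT pit.[CurrentPointInTime] FROM [psa].[PointInTime] AS pit);

SELECT
    cust.[CustomerID],
    cust.[SurName],
    cust.[RowHash],
    cust.[SessionStartDts]
FROM
    [psa].[Customer_Temporal] FOR SYSTEM_TIME AS OF @PointInTime AS cust;
```

AS OF returns the rows for which the start is less than or equal to the given moment and the end is greater than it, so it applies the same exclusive enddating logic as the join condition in the views, and it reads from both the current and the history table.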


For measuring the duration I simply used my logging framework (see A Plug and Play Logging Solution) and selected the start and end datetime of the package executions from the [logdb].[log].[Execution] table.

Here are the results:

Testcase               Total duration (50 reads, in seconds)
Read PSA temporal      164
Read PSA Non-temporal  2686

And again we have a very convincing winner: the temporal table. It is even 16 times faster! I am still wondering how this is possible. Both tables have similar indexes, one of which is a columnstore index, and whatever I tried, I kept getting the same differences.

Ability to adopt changes in the datamodel

Change happens. So if a column is deleted or added in the source, we want to make a change in the PSA:

  • If a column is deleted, we keep it in the PSA to retain history (and make it NULLABLE when required).
  • If a column is added, we also add a column.

I have tested the cases above, plus the deletion of a column, for the temporal table.

And yes, this works. You only have to change the temporal table (add, alter or drop column); the backing history table is changed automatically by SQL Server.

There are however a few exceptions where this is not the case, e.g. IDENTITY columns. You can read more about this on MSDN.

--\
---) Adapt Changes
---) After running this script, most other scripts are broken!!
--/

--\
---) Add a column
--/
-- Staging table
ALTER TABLE [stg_internal].[Customer] ADD [CreditRating] CHAR(5) NULL;
GO

-- Non-temporal history table
ALTER TABLE [psa].[Customer_History] ADD [CreditRating] CHAR(5) NULL;
GO

-- Temporal table and its backing table.
ALTER TABLE [psa].[Customer_Temporal] ADD [CreditRating] CHAR(5) NULL;
-- Not needed, SQL Server will do this behind the scenes:
-- ALTER TABLE [psa].[Customer_TemporalHistory] ADD [CreditRating] CHAR(5) NULL;
GO

--\
---) Make a column NULLABLE
--/
-- Staging table
ALTER TABLE [stg_internal].[Customer] ALTER COLUMN [SocialSecurityNumber] CHAR(12) NULL;
GO

-- Non-temporal history table
ALTER TABLE [psa].[Customer_History] ALTER COLUMN [SocialSecurityNumber] CHAR(12) NULL;
GO

-- Temporal table and its backing table.
ALTER TABLE [psa].[Customer_Temporal] ALTER COLUMN [SocialSecurityNumber] CHAR(12) NULL;
-- Not needed, SQL Server will do this behind the scenes:
-- ALTER TABLE [psa].[Customer_TemporalHistory] ALTER COLUMN [SocialSecurityNumber] CHAR(12) NULL;
GO

--\
---) Delete a column (not advised since you will then lose history).
--/
-- Staging table
ALTER TABLE [stg_internal].[Customer] DROP COLUMN [StateOrProvince];
GO

-- Non-temporal history table
ALTER TABLE [psa].[Customer_History] DROP COLUMN [StateOrProvince];
GO

-- Temporal table and its backing table.
ALTER TABLE [psa].[Customer_Temporal] DROP COLUMN [StateOrProvince];
GO

Temporal table with added column “CreditRating”: when added to the temporal table, the column is also automatically added to the backing history table. (I removed some other columns from the picture for simplicity.)

So the conclusion is that the structure of a temporal table can be changed when needed. This is what I wanted to know.

Simplicity and reliability of the solution

Unless you use code generation tools that generate the loading process for you, and the generated code is thoroughly tested, I would say the code to keep track of changes using a temporal table is less complex and thus less prone to errors. In particular, the enddating mechanism is handled by SQL Server, and that sounds nice to me.

There is however also a disadvantage of using a temporal table: the start and end datetime are out of your control. SQL Server assigns the values and there is nothing you can do about that. For Data Vault loading it is a common best practice to set the LoadDts of a Satellite to the same value for the entire load, and you could argue that this would also be a good idea for a PSA table.

But, as you might have noticed, my solution for that is to just add a SessionStartDts to the table in addition to the start and end Dts that SQL Server controls. I think this is an acceptable workaround.

By the way, SQL Server always uses UTC for the start and end datetimes of a temporal table, keep that in mind!
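
If you want to show or filter these UTC values in local time, AT TIME ZONE (also new in SQL Server 2016) can help. The sketch below uses 'W. Europe Standard Time' and the literal datetime purely as example values.

```sql
-- Sketch: present the UTC period start in a local time zone.
-- Assumption: 'W. Europe Standard Time' is just an example time zone.
SELECT
    cust.[CustomerID],
    cust.[EffectiveStartDts] AS [EffectiveStartDtsUtc],
    cust.[EffectiveStartDts] AT TIME ZONE 'UTC'
        AT TIME ZONE 'W. Europe Standard Time' AS [EffectiveStartDtsLocal]
FROM
    [psa].[Customer_Temporal] AS cust;

-- The other way around: convert a local point in time to UTC before time travelling.
DECLARE @LocalPointInTime DATETIME2(2) = '2017-02-28 07:34:42.98';  -- example value
DECLARE @UtcPointInTime DATETIME2(2) =
    CONVERT(DATETIME2(2),
            @LocalPointInTime AT TIME ZONE 'W. Europe Standard Time' AT TIME ZONE 'UTC');

SELECT cust.[CustomerID], cust.[SurName]
FROM   [psa].[Customer_Temporal] FOR SYSTEM_TIME AS OF @UtcPointInTime AS cust;
```

The first AT TIME ZONE tells SQL Server which zone the DATETIME2 value is in, the second one converts it to the target zone.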

Ability to do historic loads

For this topic I refer to Data Vault best practices again. When using Data Vault, the LoadDateTimeStamp always reflects the current system time, except when historic data is loaded: then the LoadDateTimeStamp is set to the (estimated) original date/time of the delivery of the data row.

This can be a bit problematic when you use a PSA with system-generated start and end dates, or at least that is what I thought for a while. I thought this was spoiling all the fun of the temporal table.

But suddenly I realized it is not!

Let me explain. Suppose your staging table has a SessionStartDts column (or LoadDts if you like) for which you provide the value.

Besides that, you have the EffectiveStartDts and EffectiveEndDts (or whatever name you give to these columns) of the temporal table, which SQL Server controls.

Be aware of the role that each of the two “timelines” must play:

  • EffectiveStartDts and EffectiveEndDts are only used to select the staging rows at a point in time. They are ignored further down the way into the EDW.
  • SessionStartDts, which can be set to a historic date/time, is used further down the way into the EDW to do the enddating of satellites and so on.

How would this work? As an example, take the views that I used for the read test, which contain both the [SessionStartDts] and the [PointInTimeDts] (for technical reasons converted to VARCHAR). The math to get the right rows out works on the “technical timeline” (SQL Server controlled columns), while the [SessionStartDts] is available later for creating timelines in satellites.

CREATE VIEW [psa].[Timeline_H]
AS
/*
==========================================================================================
Author:      Hans Michiels
Create date: 15-FEB-2017
Description: View used for time travelling.
==========================================================================================
*/
SELECT TOP 2147483647
    CONVERT(VARCHAR(30), alltables_timeline.[SessionStartDts], 126) AS [SessionStartDtsString],
    -- If the point in time is the maximum value of the [EffectiveStartDts] of the applicable [SessionStartDts]
    -- you will select all rows that were effective/valid after this session/load.
    CONVERT(VARCHAR(30), MAX(alltables_timeline.[EffectiveStartDts]), 126) AS [PointInTimeDtsString]
FROM
    (
    SELECT DISTINCT subh.[SessionStartDts], subh.[EffectiveStartDts]
    FROM [psa].[Customer_History] subh WITH (READPAST)
    -- UNION MORE STAGING TABLES HERE WHEN APPLICABLE
    ) alltables_timeline
GROUP BY
    CONVERT(VARCHAR(30), alltables_timeline.[SessionStartDts], 126)
ORDER BY
    CONVERT(VARCHAR(30), alltables_timeline.[SessionStartDts], 126)
GO

CREATE VIEW [psa].[Timeline_TH]
AS
/*
==========================================================================================
Author:      Hans Michiels
Create date: 15-FEB-2017
Description: View used for time travelling.
==========================================================================================
*/
SELECT TOP 2147483647
    CONVERT(VARCHAR(30), alltables_timeline.[SessionStartDts], 126) AS [SessionStartDtsString],
    -- If the point in time is the maximum value of the [EffectiveStartDts] of the applicable [SessionStartDts]
    -- you will select all rows that were effective/valid after this session/load.
    CONVERT(VARCHAR(30), MAX(alltables_timeline.[EffectiveStartDts]), 126) AS [PointInTimeDtsString]
FROM
    (
    SELECT DISTINCT subh.[SessionStartDts], subh.[EffectiveStartDts]
    FROM [psa].[Customer_Temporal] FOR SYSTEM_TIME ALL AS subh WITH (READPAST)
    -- UNION MORE STAGING TABLES HERE WHEN APPLICABLE
    ) alltables_timeline
GROUP BY
    CONVERT(VARCHAR(30), alltables_timeline.[SessionStartDts], 126)
ORDER BY
    CONVERT(VARCHAR(30), alltables_timeline.[SessionStartDts], 126)
GO

Drawing a conclusion about using temporal tables for a PSA

Consideration                                                              And the winner is ..
Speed of data loading                                                      Temporal table (4 times faster!)
Speed of reading rows at a certain moment in time (time travel mechanism)  Temporal table (16 times faster!)
Ability to adopt changes in the datamodel                                  Ex aequo (only in exceptional cases changing the temporal table is more complex).
Simplicity and reliability of the solution                                 Temporal table.
Ability to do historic loads                                               Ex aequo, if you know what you are doing.

I think there are enough reasons for using temporal tables for a PSA! Do you agree?

Can I use a temporal table for a Data Vault Satellite?

Due to the similarities with a table in the Persistent Staging Area, I think the test results on read and write performance also hold true for satellites.

However, in satellites you cannot get away with the system-generated start and end timestamps when you have to deal with historic loads, unless you make serious compromises on the technical design.

What does not work is switching off SYSTEM_VERSIONING temporarily (ALTER TABLE [psa].[Customer_Temporal] SET (SYSTEM_VERSIONING = OFF)) and then updating the dates. Because the columns are created as GENERATED ALWAYS, this is not allowed.

Besides that, it would be a clumsy solution that still requires manual management of the timeline, in a more complex way than with normal satellites!

So that leaves only one other solution, which requires, as said, a serious compromise on the technical design.

If you make the use of point in time tables mandatory for every hub and its satellites, you could decouple historical and technical timelines. Using a mechanism similar to the view for time travelling, you could attach the point in time date 2013-02-02 (historical) to the EffectiveStartDts (or LoadDts if you like) of 2017-02-28 06:34:42.98 (technical date from the temporal table) of a certain satellite row.

And if you follow the holy rule that the Business Vault (in which the point in time tables exist) should always be rebuildable from the Raw Vault, you must also store the historical start date as an additional attribute in the satellite, but exclude it from change detection.

Is it worth this sacrifice in order to be able to use temporal tables?

I don’t know, “it depends”. It feels like bending the Data Vault rules, but at least it can be done. Keep that in mind.

Is using temporal tables for full auditing in an OLTP system a good idea?

When auditing is needed due to legislation or internal audit requirements, I certainly think it is a good idea to use temporal tables. They are transparent to front end applications that write to the database, and the performance seems quite okay (see above). Obviously the performance will always be a bit worse than with non-temporal tables in an OLTP scenario, but that is not unique to temporal tables. Every solution to track history will cost performance.
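
For an auditor, the typical question is “what did this row look like over time?”. As a sketch, and reusing the [psa].[Customer_Temporal] table from this article only as a stand-in for an OLTP table, such a question can be answered with a single query. The key value 42 is made up.

```sql
-- Sketch: full change history of one row, current version included.
DECLARE @CustomerID INT = 42;  -- hypothetical key value

SELECT
    cust.[CustomerID],
    cust.[SurName],
    cust.[Address],
    cust.[EffectiveStartDts],
    cust.[EffectiveEndDts]
FROM
    [psa].[Customer_Temporal] FOR SYSTEM_TIME ALL AS cust
WHERE
    cust.[CustomerID] = @CustomerID
ORDER BY
    cust.[EffectiveStartDts];
```

FOR SYSTEM_TIME ALL combines the current table and the history table, so the application that writes the data does not need to know that the history table exists.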

Conclusion / Wrap up

In this article I discussed some possible applications for the temporal table, a new feature in SQL Server 2016.

It can be used for PSA (Persistent Staging Area) tables, Data Vault Satellites and tables in OLTP systems. If you know what you are doing, temporal tables can be of great value. That’s at least what I think.

Resources on the web

  • Free ebook: Introducing Microsoft SQL Server 2016 (on Microsoft web site)
  • Temporal tables (MSDN)
  • Changing the Schema of a System-Versioned temporal table (MSDN)
  • Using a Persistent Staging Area: What, Why, and How (blog post)
  • Stop being so precise! and more about using Load(end)dates (blog post)
  • A Plug and Play Logging Solution (blog post)

And again, if you are interested you can download all scripts and SSIS Packages used for my test here, including the ones not published inline in this article.

Source: https://www.sqlshack.com/temporal-table-applications-in-sql-data-warehouse-environments/
