I am a highly motivated and organised professional with more than ten years of experience as a Database Specialist and Architect/Designer.

Author: Data Management Engineer

Tourism recovery mode

It’s wonderful to have been invited by the LinkedIn Editors! Pursuing a passion is an incredible thing. On this occasion, Mitchell Van Homrigh invited me to contribute on how tech can solve tourism pain points. Below are the points that sparked my interest.

 

Like many others, the tourism sector has been significantly impacted by the COVID-19 pandemic (it seems pharmaceutical ambition was prepared to destroy everything in its path just to add a couple of million dollars to its revenue, but that is a discussion for later). As we move towards recovery, technology could play a crucial role in reviving the industry. From my perspective as a Data Management Engineer, I would like to highlight two things: Virtual Reality and Data Analytics.

  1. Virtual Reality (VR) Tours: The airline industry is recovering, and people are regaining their health and, “boom”, their stamina. For those planning a trip to any country or airport, VR tours can provide immersive experiences. The idea is to offer a VR tour of the destination airport, along with information on what to do before or after disembarking or boarding, such as where to go, what documentation is required, and emergency services information. When travelling to another country, it’s easy to feel lost because you don’t know how far away the connecting flight or immigration/customs area is. VR tours can help passengers be more agile and reduce their time in the airport, improving the customer experience.

  2. Data Analytics: Tourism businesses can gain insight into customer behaviour and preferences by utilising data analytics to create more personalised experiences. This can lead to various benefits, such as predicting demand, improving pricing and availability, and optimising inventory. In fact, a study showed that nine hotels experienced a 22% average increase in revenue after implementing pricing optimisation. Additionally, data analytics can inform funding decisions and shape business development based on detailed user preferences. By identifying potential customers at different stages of the trip-planning process, analytics can help businesses target specific groups. Finally, data analytics can improve marketing campaign effectiveness by identifying the best channel to reach customers.

The innovative concept of virtual reality tours can potentially transform how we explore airports and destinations. In a world recovering from a pandemic, these immersive experiences offer a much-needed solution for travellers seeking a hassle-free and efficient journey. Imagine arriving in a new country and embarking on a virtual reality tour that guides you through customs procedures, connecting flights, and emergency services. This game-changing technology has the power to revolutionise the travel industry and create an even more satisfying experience for customers.

Regarding data, analytics acts as a compass for tourism companies, guiding them in the right direction. By harnessing the power of data, businesses can gain valuable insights into customer behaviours and preferences, allowing them to create personalised experiences. Data analytics is the key ingredient for achieving operational efficiency and financial success, from predicting demand and optimising pricing to shaping business strategies based on user preferences. This is not just a theoretical concept but a proven technique for boosting revenue and engaging consumers.

In summary, the COVID-19 outbreak had a major impact on the tourism industry, challenging its ability to adapt and recover. Looking ahead, it is clear that technology can play a critical role in supporting the industry. As a data management engineer, I believe virtual reality (VR) and data analytics are promising game changers. In this post-pandemic era, where the race to recovery is fierce, those who leverage technology wisely will survive and thrive, delivering the best service and experiences to passengers worldwide. The future of tourism is tech-infused, and it’s time to take that leap into a brighter, more innovative tomorrow.

 

Select AI – Natural Language to SQL Generation on Autonomous Database

Autonomous Database Select AI enables you to easily retrieve data using natural language. Combining generative AI with Oracle SQL allows you to express your desired outcome and let the database generate the appropriate SQL query for your schema. While some language models may excel at generating SQL, executing it on your database can be a challenge. Select AI streamlines the process by generating SQL tailored to your specific database.

Large language models (LLMs) powered by AI are trained on vast amounts of text data and can understand the nuances and intended meaning of most natural language inquiries. By using natural language to create SQL, the time taken to generate queries is reduced, query construction is simplified, and the need for specialized SQL expertise is minimized or eliminated. Obtaining information from your database that involves creating queries with multiple joins, nested subqueries, and other SQL structures becomes easier and faster when using plain language.

By learning effective SQL query patterns from curated training data, LLMs can produce more efficient queries – enabling them to perform better. As part of Oracle Autonomous Database, Select AI inherits all security and authentication features of the database.

By virtue of being integrated with Oracle SQL, this capability is available through any SQL IDE, SQL Developer Web, Oracle Application Express (APEX), and Oracle Machine Learning Notebooks. Any application that invokes SQL and has an AI provider account also has access to Select AI.

Large Language Models Alone are Insufficient

Certain LLMs excel at generating SQL code and are capable of accurately interpreting the user’s intentions. These LLMs can even generate table and field names that fit within the syntax of a valid SQL query. However, it’s important to note that such a query will only work if your database happens to contain tables and columns with those exact names. Essentially, the LLM imagines the necessary tables and fields in order to generate the SQL code.

To generate a runnable query, we must first tie it to a certain schema, which is where Select AI comes in. By creating a profile, you can use your local schema by default, but you can also indicate which schema(s) and table(s) you wish to be considered when producing queries.

The initial natural language query is enhanced with metadata from the schema(s) found in the user’s profile. Feeding this information to the LLM allows it to generate a SQL query that can be executed against your database and, hopefully, return the results you’re searching for.

DBMS_CLOUD_AI Package

This feature is enabled by a new package in Autonomous Database, DBMS_CLOUD_AI, which supports the use of LLMs for generating SQL from natural language prompts. The package provides access to user-specified LLMs along with knowledge of the user’s accessible database metadata. This enables producing runnable SQL queries applicable to that schema. The DBMS_CLOUD_AI package currently integrates with AI providers such as OpenAI and Cohere, with others planned to follow. To use this feature, you need an account with the AI provider and must supply your API credentials to Autonomous Database.

Getting started

To get started with Select AI, sign into your Autonomous Database instance with administrator privileges and add your user (here MY_USER) to the ACL list and grant access to the DBMS_CLOUD_AI package:

BEGIN
  DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
    host => 'api.openai.com',
    ace  => xs$ace_type(privilege_list => xs$name_list('http'),
                        principal_name => 'MY_USER',
                        principal_type => xs_acl.ptype_db));
END;
/

GRANT EXECUTE ON DBMS_CLOUD_AI TO MY_USER;

Create a database credential to your AI provider account. In the case of OpenAI, the password is the uniquely generated token for API usage.

BEGIN
  DBMS_CLOUD.DROP_CREDENTIAL(credential_name => 'OPENAI_CRED');
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OPENAI_CRED',
    username        => 'OPENAI',
    password        => 'xyz');
END;
/

Create your DBMS_CLOUD_AI profile. Here, we list the schema(s) we want to be considered (e.g., SH) and, optionally, any tables (e.g., customers, sales, etc.).

BEGIN
  DBMS_CLOUD_AI.DROP_PROFILE(profile_name => 'OPENAI');
  DBMS_CLOUD_AI.CREATE_PROFILE(
    profile_name => 'OPENAI',
    attributes   => '{"provider": "openai",
                      "credential_name": "OPENAI_CRED",
                      "object_list": [{"owner": "SH", "name": "customers"},
                                      {"owner": "SH", "name": "sales"},
                                      {"owner": "SH", "name": "products"},
                                      {"owner": "SH", "name": "countries"}]}');
END;
/

Set your DBMS_CLOUD_AI profile, which must be done in each database session.

BEGIN
  DBMS_CLOUD_AI.SET_PROFILE(profile_name => 'OPENAI');
END;
/
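With the profile set, you can ask questions in plain language straight from SQL. A minimal sketch, assuming the Select AI actions documented at the time of writing and the SH objects configured in the profile above:

-- Ask in plain language; by default Select AI generates and runs the SQL
select ai how many customers are in each country;

-- Show the generated SQL instead of executing it
select ai showsql how many customers are in each country;

Other actions (for example, narrate and chat) are described in the Autonomous Database documentation.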

Very soon I will publish more information to complement this new feature.

Skill No 4 – Performance and Tuning

Introduction

In the world of database management, ensuring optimal performance is crucial for businesses to operate efficiently. In the “Diagnosing and Tuning Database Performance” section, we delve into the various factors involved in fine-tuning and optimising a database instance.

Database performance tuning involves analysing and optimizing different aspects of a database system to enhance its speed, efficiency, and reliability. It encompasses areas such as query optimisation, index usage, hardware configuration, memory management, and more.

By addressing these factors systematically, businesses can experience improved response times, reduced downtime, enhanced scalability, and ultimately better user experiences.

Whether you are an experienced database administrator or someone new to the database management field, I believe this blog and info will provide valuable insights into effectively diagnosing and tuning your database’s performance.

Stay tuned, and leave a comment below with your contact details, as we uncover practical tips and techniques that can help you optimise your database instance for maximum efficiency.

If you’re looking for more engaging content like this, the best way is to get in contact with me directly; flick a message to datamanagemente@gmail.com

What Is Database Monitoring?

Database monitoring allows companies to keep tabs on network and database performance. It refers to a set of specific tasks performed regularly to ensure a database functions as it should. This includes both hardware and software maintenance. Database monitoring is like an ongoing health check-up for your data management system (DMS).

A DMS will collect vital data so the user can perform important tasks like:

  • Tracking index and table usage (see the sketch after this list)
  • Figuring out why the system is performing poorly
  • Testing the impact of various changes like modified queries
  • Identifying weak spots and bottlenecks so they can be optimised
  • Tracking the performance of specific SQL queries or applications
  • Forecasting what hardware will be needed based on how a database is used
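To make the first task concrete, here is a minimal sketch in T-SQL built on the sys.dm_db_index_usage_stats DMV (the counters reset when the instance restarts, so treat the numbers as indicative):

-- How often each index in the current database is read vs. written
SELECT OBJECT_NAME(s.object_id) AS table_name,
       i.name                   AS index_name,
       s.user_seeks, s.user_scans, s.user_lookups,  -- read activity
       s.user_updates                               -- write/maintenance cost
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id
WHERE s.database_id = DB_ID()
ORDER BY s.user_updates DESC;

Indexes with high user_updates but near-zero seeks, scans, and lookups are candidates for review.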

Performance and tuning in database engines such as Oracle and SQL Server, and the role of AI: I can tell you how to make this possible.

Database performance is critical to any application that relies on data storage and retrieval. Performance tuning is the process of administering a database to improve its performance. This can include optimizing SQL statements and query execution plans to complete requests efficiently.

This article will discuss performance tuning in different database engines and how artificial intelligence (AI) can help with automatic processes.

Skill No 3 – Monitoring Databases

What is Database Monitoring, and Why is it so Important?

Monitoring databases is essential for a database management professional, as it helps ensure the database systems’ health and performance.

Database monitoring, or database performance monitoring, is the practice of monitoring databases in real time. By tracking specific metrics, database monitoring enables teams to understand the health and behavior of their database systems. This helps with troubleshooting and finding ways to optimise database performance.

With more data than ever being created, making the right choices among the countless options for data management and analytics is a top priority for many organizations. To help organizations progress along their data-driven journeys, each year DBTA presents the Readers’ Choice Awards, which recognise companies whose products have been selected by experts like you... and, being modest, me 🙂

What is Database Performance?

Database performance refers to optimising resource use to increase throughput and minimise contention, enabling the largest possible workload to be processed. Five key factors influence database performance: workload, throughput, resources, optimisation, and contention.

  1. Workload: The workload requested of the DBMS defines the demand. It combines online transactions, batch jobs, ad hoc queries, data warehousing analysis, utilities, and system commands directed through the system at any given time. The workload can fluctuate drastically from day to day, hour to hour, and even minute to minute.
  2. Throughput: Throughput defines the overall capability of the hardware and software to process data. It is a composite of I/O speed, CPU speed, machine parallel capabilities, the DBMS core, and the efficiency of the operating system and system software.
  3. Resources: The hardware and software tools at the system’s disposal are the system’s resources. Examples include the database kernel, disk space, memory, cache controllers, and microcode.
  4. Optimisation: The fourth defining element of database performance is optimisation. All systems can be optimised, but database systems are unusual in that query optimisation is primarily accomplished internally to the DBMS. Other factors still need to be optimised (SQL formulation, database parameters, database organisation, etc.) to enable the database optimiser to create the most efficient access paths to the data.
  5. Contention: When demand (workload) for a particular resource is high, contention can result. Contention is when two or more components of the workload attempt to use a single resource in a conflicting way (for example, dual updates to the same piece of data). As contention increases, throughput decreases; a quick way to observe contention is sketched after this list.
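In SQL Server, for example, accumulated wait statistics give a rough view of where contention is occurring. A minimal sketch (in practice you would filter out benign system waits):

-- Top wait types since the last instance restart
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms  -- time spent waiting for CPU after the resource was free
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;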

Several tools are available for monitoring databases in different engines, such as Oracle, SQL Server, and PostgreSQL. Effective database monitoring helps you identify and quickly resolve performance issues to maintain database server availability and ensure databases continuously provide organisations with the services and applications that drive their daily business.
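As a concrete starting point for “figuring out why the system is performing poorly”, SQL Server keeps statistics for cached query plans. A hedged sketch:

-- Ten most CPU-hungry statements in the plan cache
SELECT TOP (10)
       qs.execution_count,
       qs.total_worker_time / qs.execution_count AS avg_cpu_microseconds,
       SUBSTRING(st.text, 1, 200)                AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;

Oracle exposes analogous views (for example, V$SQLSTATS) for the same purpose.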

In conclusion, monitoring databases is an essential skill for a database management professional, as it helps ensure the health and performance of database systems.

AI and Data Management: A Powerful Partnership

Artificial intelligence (AI) is transforming every industry and every aspect of our lives. From healthcare to education, from entertainment to finance, AI is enabling new possibilities and creating new value. But AI is not a magic wand that can solve any problem without any effort. AI depends on data – lots of data – to learn, improve, and deliver accurate and reliable outcomes. And data needs management – effective management – to ensure its quality, accessibility, security, and integration.

What exactly does a Data Management Engineer do?

Data management is the process of collecting, organizing, storing, processing, analyzing, and sharing data in a way that supports business goals and complies with regulations. Data management involves various tasks such as data classification, cataloguing, quality control, security enforcement, and data integration. These tasks are often labour-intensive and error-prone when done manually or with traditional tools.

How can AI help us do our job more efficiently?

This is where AI can help. AI can automate and simplify tasks related to data management across discovery, integration, cleansing, governance, and mastering. AI can improve data understanding and identify privacy and quality anomalies. AI can also enhance data security by detecting threats and enforcing policies. AI can enable data integration by matching records across sources and resolving conflicts.
By using AI for data management, Database Specialists can benefit from:
  1. Improved productivity: AI can reduce the time and effort required for data management tasks by automating repetitive or complex operations.
  2. Increased accuracy: AI can improve the quality and consistency of data by identifying errors or inconsistencies and correcting them.
  3. Enhanced scalability: AI can handle large volumes of data from various sources without compromising performance or reliability.
  4. Greater agility: AI can adapt to changing business needs or regulatory requirements by learning from feedback or new information.
  5. Higher value: AI can unlock the potential of data by providing insights or recommendations that support decision-making or innovation.
Some examples of how AI can be useful for data management are:
  • Classification: AI can extract relevant information from documents, images, videos, or other media using natural language processing (NLP), computer vision (CV), or speech recognition techniques.
  • Cataloguing: AI can help locate data by indexing it based on metadata or content analysis using NLP or CV techniques.
  • Quality: AI can reduce errors in the data by validating it against rules or standards using NLP or machine learning (ML) techniques.
  • Security: AI can keep data safe from unauthorized access or misuse by applying encryption or masking techniques using ML techniques.
  • Integration: AI can help merge data from different sources by matching records based on similarity measures using ML techniques.
  • Data Cleansing: AI can automatically identify and correct errors in data by using machine learning algorithms. For example, AI can identify and remove duplicate data, identify outliers, and standardize data (a rule-based baseline is sketched after this list).
  • Data Governance: AI can help enforce data governance policies by identifying and flagging any inconsistencies or violations in the data. AI can also provide recommendations for policies based on the analysis of the data.
  • Data Discovery: AI can help discover hidden patterns and insights in large datasets that are difficult for humans to uncover. For example, AI can identify trends and correlations between different data sets and provide recommendations for further analysis.
  • Data Mastering: AI can help improve data accuracy and completeness by merging, standardizing, and de-duplicating data from different sources. This can help reduce data inconsistencies and improve data quality.
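To make the cleansing bullet concrete, here is a deliberately non-AI baseline in T-SQL for the de-duplication part; dbo.Customers, customer_email, and updated_at are hypothetical names:

-- Keep the most recently updated row per e-mail address and delete the rest
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY customer_email   -- assumed natural key
                              ORDER BY updated_at DESC) AS rn
    FROM dbo.Customers
)
DELETE FROM ranked
WHERE rn > 1;

An AI/ML approach essentially replaces the exact-match PARTITION BY with fuzzy similarity scoring across sources.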
AI and data management are a powerful partnership that can enable organizations to leverage their most valuable asset – their data – in a more efficient and effective way. By combining human expertise with machine intelligence, organizations can achieve better outcomes faster while reducing costs and risks. If you want to learn more about how AI can help you with your data management challenges, please contact me, and we will be happy to assist you.
Regards,

Oracle Database New Versions – Licensing & Timelines

Introduction

Oracle Database 21c and 23c are the latest versions of Oracle Database available in the market. Both versions offer an enhanced feature set and come with a flexible licensing model. Key highlights include improved performance, better scalability, higher cloud compatibility, and enhanced security features.

As a project manager or solution architect, staying up-to-date with the latest database technology is important to ensure your organisation runs efficiently and effectively. Oracle has recently released two new database versions, Oracle Database 21c and 23c, which offer several new features and updates. In this article, we’ll discuss the licensing types available in Oracle Database 21c and how long it is supported, highlight the new features, and compare the installation cost in Oracle Cloud OCI with AWS EC2 instances.

Types of Licensing in Oracle Database 21c

Oracle Database 21c offers different licensing options depending on your organization’s needs. The first is Standard Edition 2, which is designed for small to medium-sized businesses that require basic functionality. The Enterprise Edition offers more advanced features, such as high availability and advanced security. The Enterprise Edition can also be licensed on a per-user basis or per-core basis, depending on your organization’s needs.

The Oracle Database offerings, their offering types, and the abbreviated names used in this guide are summarised below:

Oracle Database Standard Edition 2 (SE2) – On-Premises
Oracle Database Standard Edition 2 includes features necessary to develop workgroup, department-level, and Web applications.

Oracle Database Enterprise Edition (EE) – On-Premises
Oracle Database Enterprise Edition provides performance, availability, scalability, and security for developing applications such as high-volume online transaction processing (OLTP) applications, query-intensive data warehouses, and demanding Internet applications. It can be enhanced with the purchase of Oracle Database options and Oracle management packs.

Oracle Database Enterprise Edition on Engineered Systems (EE-ES) – On-Premises
Oracle Database Enterprise Edition software installed on an on-premises engineered system (Oracle Exadata Database Machine or Oracle Database Appliance). Includes all of the components of Oracle Database, and you can further enhance this offering with the purchase of Oracle Database options and Oracle management packs. The licensing policies for EE-ES vary depending on whether it is installed on Oracle Exadata Database Machine or Oracle Database Appliance; be sure to make note of these differences, which are documented in the subsequent sections of this guide.

 

Per-User Licensing
Per-user licensing allows an organization to license the Oracle Database based on the number of users who will be accessing the database. Each user requires a license, regardless of how many physical or virtual cores are used by the database. This option can be cost-effective for organizations with a relatively small number of users accessing the database.

Per-Core Licensing:
Allows an organization to license the Oracle Database based on the number of physical or virtual cores used by the database. The number of licenses required is based on the number of cores allocated to the database, regardless of the number of users accessing the database. This option can be cost-effective for organizations with a larger number of users accessing the database or for databases that require a high number of cores to perform efficiently. It’s important to note that Oracle offers two main licensing metrics, Named User Plus and Processor-based licensing, and the choice between them can significantly affect the cost; both should be carefully considered when choosing a licensing option.

It’s important to note that Oracle offers a new type of license for Oracle Database 21c called the Universal License. This license allows customers to use any feature in the Enterprise Edition for a flat fee. This is a departure from the previous model, where customers would have to purchase individual licenses for each feature they wanted to use.

More about licensing in Oracle

Oracle Database 21c Support

Oracle Database 21c is currently supported until at least 2026, giving organizations ample time to upgrade from previous versions. It’s important to note that support for previous versions, such as Oracle Database 19c, will eventually end, so it’s a good idea to start planning your upgrade path sooner rather than later.

New Features in Oracle Database 21c

Oracle Database 21c offers several new features and updates that can benefit your organization. Here are a few of the most notable:

Blockchain Tables
Oracle Database 21c now includes blockchain tables, allowing organizations to store and manage blockchain data. This can simplify blockchain deployment and management for organizations that use blockchain technology. Using the Blockchain Tables feature in Oracle Database 21c requires an additional license.
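As a hedged illustration of the syntax (adapted from the documented bank_ledger example; the retention values are arbitrary):

CREATE BLOCKCHAIN TABLE bank_ledger (
  bank   VARCHAR2(30),
  amount NUMBER
)
NO DROP UNTIL 31 DAYS IDLE   -- the table cannot be dropped while recent rows exist
NO DELETE LOCKED             -- rows can never be deleted
HASHING USING "SHA2_512" VERSION "v1";

Rows are chained together with cryptographic hashes, so tampering becomes detectable.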
Native JSON Data Type:
Oracle Database 21c now includes a native JSON data type, making storing and manipulating JSON data within the database easier.

The Native JSON data type does not require an additional license; it is included in the standard Oracle Database license and can be used without any additional licensing cost.
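A small sketch of what the native type enables (table and column names are illustrative):

CREATE TABLE purchase_orders (
  id     NUMBER PRIMARY KEY,
  po_doc JSON   -- stored internally in an optimised binary JSON format
);

INSERT INTO purchase_orders VALUES (1, '{"customer": "ACME", "items": 3}');

-- Simple dot notation over the JSON column (note the mandatory table alias)
SELECT po.po_doc.customer
FROM   purchase_orders po;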
Automatic Indexing:
Oracle Database 21c includes automatic indexing, which uses machine learning algorithms to create and manage indexes automatically. This can improve query performance and reduce the need for manual index management.

Yes, the use of Automatic Indexing in Oracle Database 21c requires an additional license: organizations that want to use this feature must purchase the Oracle Tuning Pack, a separate add-on license.
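Where the feature is available, enabling it is a one-liner through the documented DBMS_AUTO_INDEX package; a sketch:

BEGIN
  -- Create auto indexes as visible and usable by the optimiser
  DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_MODE', 'IMPLEMENT');
END;
/

-- Review what the feature has created or dropped recently
SELECT DBMS_AUTO_INDEX.REPORT_ACTIVITY() FROM DUAL;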
In-Memory Database:
Oracle Database 21c includes a new in-memory database feature that allows for faster performance and processing of large datasets. This feature is useful for applications that require real-time processing of large volumes of data, such as financial trading systems, online gaming, and social media.

Yes, the use of In-Memory Database in Oracle Database 21c requires an additional license: organizations that want to use this feature will need to purchase the Oracle Database In-Memory option, a separate add-on license.
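Once licensed, adoption is declarative; a minimal sketch (the size and table are examples):

-- Reserve an in-memory column store area (takes effect after a restart)
ALTER SYSTEM SET INMEMORY_SIZE = 2G SCOPE = SPFILE;

-- Mark a table for population into the column store
ALTER TABLE sh.sales INMEMORY PRIORITY HIGH;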
Multitenant Enhancements:
Oracle Database 21c includes several enhancements to its multitenant architecture, which allow for more efficient and secure sharing of database resources among multiple tenants. This feature is useful for cloud-based applications and service providers that manage multiple databases and customers on a single server.
Hybrid Partitioned Tables:
Oracle Database 21c supports hybrid partitioned tables, which allow some partitions of a table to reside inside the database while others live in external files or cloud storage. This feature is useful for applications with complex data life cycles, such as analytics and reporting over a mix of hot and archived data. The use of Hybrid Partitioned Tables does not require an additional license in Oracle Database 21c.
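A hedged sketch of the DDL shape (the directory object sales_archive and the file name are assumptions):

CREATE TABLE sales_hybrid (
  sale_id   NUMBER,
  sale_date DATE,
  amount    NUMBER
)
EXTERNAL PARTITION ATTRIBUTES (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY sales_archive
  ACCESS PARAMETERS (FIELDS TERMINATED BY ',')
  REJECT LIMIT UNLIMITED
)
PARTITION BY RANGE (sale_date) (
  -- cold, archived data lives outside the database in a flat file
  PARTITION p_2021 VALUES LESS THAN (DATE '2022-01-01')
    EXTERNAL LOCATION ('sales_2021.csv'),
  -- hot data remains a regular internal partition
  PARTITION p_2022 VALUES LESS THAN (DATE '2023-01-01')
);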
Cost of Installation in Oracle Cloud OCI and AWS EC2 Instance

Installing Oracle Database 21c in Oracle Cloud OCI is relatively straightforward. Oracle Cloud OCI offers several different pricing options, including pay-per-use and subscription models. The cost will depend on factors such as the size of your database, the number of users, and the level of support you require.

Comparing the cost of installation in Oracle Cloud OCI to AWS EC2 instances can be difficult, as the cost will depend on a variety of factors. However, Oracle Cloud OCI offers several advantages, such as built-in support for Oracle Database and seamless integration with other Oracle Cloud services.

Per-core (processor-based) licensing is a model in which an organization licenses the Oracle Database based on the number of physical or virtual cores used by the database. Oracle’s two main licensing metrics are Named User Plus and Processor-based licensing. Here are the differences and examples of each:

Named-User Plus Licensing: Named User Plus is a user-based metric that requires an organization to license the Oracle Database based on the number of named users (and devices) accessing the database, subject to Oracle’s per-processor minimums. This option can be cost-effective for organizations with a relatively small number of users accessing the database.

For example, if an organization has 50 named users who will be accessing the database, and the database is running on a server with 16 cores, the organization would need to purchase 50 named-user-plus licenses (or more, if Oracle’s per-processor minimums apply). The number of cores used by the database would not otherwise affect the number of licenses required.

Processor-Based Licensing: Processor-based licensing is a per-core licensing model that requires an organization to license the Oracle Database based on the number of physical or virtual cores used by the database. The number of licenses required is based on the number of cores allocated to the database, regardless of the number of users accessing the database. This option can be cost-effective for organizations with a larger number of users accessing the database or for databases that require a high number of cores to perform efficiently.

For example, if an organization has 500 users who will be accessing the database, and the database is running on a server with 16 cores, the organization would need to license every core on the server. With Oracle’s standard core factor of 0.5 for most x86 processors, that works out to 16 cores x 0.5 = eight (8) processor licenses.

Estimating the cost of Processor-Based Licensing for an OCI VM.Standard2.8 shape with 8 VCPUs would depend on several factors, such as the version of the Oracle Database being used, the level of support required, and any additional Oracle products or options being used.

However, as an example, let’s assume you are using Oracle Database Standard Edition 2 and require processor-based licensing. According to Oracle’s current pricing (as of March 2023), the cost of a processor license for Oracle Database Standard Edition 2 is $17,500 USD.

For a VM.Standard2.8 instance with 8 VCPUs, you would need to purchase eight (8) processor licenses. Therefore, the estimated cost of processor-based licensing for this instance would be 8 x $17,500 = $140,000 USD.

It’s important to note that this is just an estimation, and the actual cost may vary depending on the specific licensing requirements and any discounts or promotions available from Oracle. It’s recommended to consult with an Oracle representative to get a more accurate estimate based on your specific licensing needs.

Example No 2 – Cost

Let’s say we’re looking at purchasing a Dell EMC PowerEdge R740 server with two Intel Xeon Silver 4214 CPUs. According to Dell’s website, the list price for this server configuration is approximately $9,500.

It’s important to note that this price is for the server hardware only and does not include any software licenses or additional services that may be required. The total cost of a server with more than one CPU will depend on various factors, including the brand and model of the server, the number of CPUs, the amount of memory and storage, and any additional software licenses or services that may be required. Organizations should carefully consider their requirements and budget before selecting a server configuration and pricing option.

Conclusion

Oracle Database 21c and 23c offer several new features and updates that can benefit organizations of all sizes. Understanding the types of licensing available, support timelines, and new features is essential for project managers and solution architects to make informed decisions.

Configure Data Access – Column Level Encryption

Introduction

In today’s world, data security is more critical than ever. With the increasing number of data breaches, organizations must protect their sensitive information from unauthorized access, and the ability to encrypt data at the column level is a critical capability in any modern database engine. One way to accomplish this is through column-level encryption in SQL Server (the concept is also applicable in Oracle). In this article, we will discuss how to implement column-level encryption, including examples and use cases, while emphasizing the importance of security and performance efficiency.

What is Column-Level Encryption?

Column-level encryption is a method of encrypting individual columns within a database table. This type of encryption protects sensitive information such as credit card numbers, social security numbers, and personal information. When implemented correctly, column-level encryption can provide a higher level of security than database-level encryption.

To understand and implement encryption in SQL Server you need to understand its encryption hierarchy and key management architecture. Layers of encryption are protected by preceding layers of encryption that can use asymmetric keys, certificates, and symmetric keys.

Extensible Key Management (EKM)
EKM enables the encryption keys that protect the database files to be stored outside of the SQL Server environment, such as on a smartcard, a USB device, or the EKM module of a Hardware Security Module (HSM). It also helps secure the SQL Server instance from database administrators, because they will not necessarily have access to the external EKM/HSM module.

Service Master Key
The Service Master Key (SMK) is the root of the database engine’s encryption hierarchy and is generated automatically the first time it is needed to encrypt another key. By default, the SMK is encrypted using the Windows Data Protection API (DPAPI) at the operating system level, which uses the local machine key. The SMK can only be opened by the Windows service account that created it, or by a principal that knows the service account name and its password.

Database Master Key
The Database Master Key (DMK) is a symmetric key used to protect the private keys of certificates and asymmetric keys that are present in the database. When created it is encrypted using AES_256 and a password you provide. Query the [sys].[symmetric_keys] catalog view to get information about the DMK.
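A quick sketch of that lookup (the DMK appears under a fixed internal name):

SELECT name, algorithm_desc, key_length, create_date
FROM sys.symmetric_keys
WHERE name = '##MS_DatabaseMasterKey##';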

Asymmetric Key
An asymmetric key consists of a private and corresponding public key. Asymmetric encryption is computationally more expensive, but more secure than symmetric encryption. You can use an asymmetric key to encrypt a symmetric key within a database.

Symmetric Key
A symmetric key is a single key that is used for both encryption and decryption. Symmetric encryption is generally preferred over asymmetric encryption because it is faster and less computationally expensive.

Certificate
Certificates are digitally signed security objects that contain a public (and optionally a private) key. SQL Server can generate its own certificates, and you can also use externally generated certificates. Just like asymmetric keys, certificates can be used in asymmetric encryption.

Implementing Column-Level Encryption in SQL Server

SQL Server offers two options for column-level encryption: symmetric and asymmetric encryption. Let’s take a look at both options.

Symmetric Encryption
Uses the same key to encrypt and decrypt data. This type of encryption is faster and more efficient than asymmetric encryption. However, it is less secure since the same key is used to encrypt and decrypt the data.

To implement symmetric encryption in SQL Server, follow these steps:

  • Step 1: Create a Master Key

The first step is to create a master key that will be used to protect the symmetric key. You can create a master key using the following T-SQL command:

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'VeryStrongPasswd';
  • Step 2: Create a Symmetric Key

The next step is to create a symmetric key that will be used to encrypt and decrypt the data. You can create a symmetric key using the following T-SQL command:

CREATE SYMMETRIC KEY MySymmetricKey
WITH ALGORITHM = AES_256
ENCRYPTION BY PASSWORD = 'MyStrongPassword';
  • Step 3: Encrypt the Column

Once you have created the symmetric key, you must open it in the session before you can use it to encrypt the column. Note that ENCRYPTBYKEY returns VARBINARY, so the target column must be a VARBINARY type:

OPEN SYMMETRIC KEY MySymmetricKey DECRYPTION BY PASSWORD = 'MyStrongPassword';

UPDATE MyTable SET MyColumn = ENCRYPTBYKEY(KEY_GUID('MySymmetricKey'), MyColumn);

CLOSE SYMMETRIC KEY MySymmetricKey;

To decrypt the column, open the key again and use the following T-SQL commands:

OPEN SYMMETRIC KEY MySymmetricKey DECRYPTION BY PASSWORD = 'MyStrongPassword';

SELECT CONVERT(varchar, DECRYPTBYKEY(MyColumn)) AS MyColumn
FROM MyTable;

CLOSE SYMMETRIC KEY MySymmetricKey;

Asymmetric Encryption
Uses a public key to encrypt the data and a private key to decrypt it. This type of encryption is more secure than symmetric encryption but is slower and less efficient.
To implement asymmetric encryption in SQL Server, follow these steps:

  • Step 1: Create a Master Key

The first step is to create a master key that will be used to protect the asymmetric key. You can create a master key using the following T-SQL command:

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MyStrongPassword';
  • Step 2: Create an Asymmetric Key

The next step is to create an asymmetric key pair. You can create an asymmetric key pair using the following T-SQL command:

CREATE ASYMMETRIC KEY MyAsymmetricKey
WITH ALGORITHM = RSA_2048 ENCRYPTION BY PASSWORD = 'MyStrongPassword';
  • Step 3: Create a Certificate

Once you have created the asymmetric key pair, you can create a certificate using the following T-SQL command:

CREATE CERTIFICATE MyCertificate
AUTHORIZATION dbo FROM ASYMMETRIC KEY MyAsymmetricKey;
  • Step 4: Encrypt the Column

Finally, you can use the certificate to encrypt the column using the following T-SQL command:

UPDATE MyTable
SET MyColumn = ENCRYPTBYCERT(CERT_ID('MyCertificate'), MyColumn);
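To read the data back, DECRYPTBYCERT is the counterpart; a sketch assuming the certificate’s private key is protected by the password used above:

SELECT CONVERT(varchar(max),
       DECRYPTBYCERT(CERT_ID('MyCertificate'), MyColumn, N'MyStrongPassword')) AS MyColumn
FROM MyTable;

Keep in mind that asymmetric decryption is expensive; for large tables, the usual pattern is to protect a symmetric key with the certificate and encrypt the data with the symmetric key.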

Check out this other exercise!

USE tempdb; GO 

-- Create sample table 

CREATE TABLE Employees (EmployeeID INT PRIMARY KEY,EmployeeName VARCHAR(300),Position VARCHAR(100), Salary VARBINARY(128) ); GO 

-- Create SMK 

CREATE SYMMETRIC KEY SMK_Emp WITH ALGORITHM = AES_256 ENCRYPTION BY PASSWORD = 'Pa$$w0rd'; GO 

-- Open SMK 

OPEN SYMMETRIC KEY SMK_Emp DECRYPTION BY PASSWORD = 'Pa$$w0rd'; GO 

-- Verify open keys 

SELECT * FROM sys.openkeys; GO 

-- Insert data 

INSERT Employees VALUES (1, 'Marcus',      'CTO', ENCRYPTBYKEY(KEY_GUID('SMK_Emp'), '$100000'));
INSERT Employees VALUES (2, 'Christopher', 'CIO', ENCRYPTBYKEY(KEY_GUID('SMK_Emp'), '$200000'));
INSERT Employees VALUES (3, 'Isabelle',    'CEO', ENCRYPTBYKEY(KEY_GUID('SMK_Emp'), '$300000'));
GO

-- Query table with encrypted values 

SELECT * FROM Employees; GO 

-- Query table with decrypted values 

SELECT *, CONVERT(VARCHAR, DECRYPTBYKEY(Salary)) AS DecryptedSalary FROM Employees; GO 

-- Close SMK 

CLOSE SYMMETRIC KEY SMK_Emp GO 

-- Query table with decrypted values after key SMK is closed 

SELECT *, CONVERT(VARCHAR, DECRYPTBYKEY(Salary)) AS DecryptedSalary FROM Employees; GO 

-- Clever CTO updates their salary to match CEO's salary 

UPDATE Employees SET Salary = (SELECT Salary FROM Employees WHERE Position = 'CEO') WHERE EmployeeName = 'Marcus'; GO 

-- Open SMK and query table with decrypted values 

OPEN SYMMETRIC KEY SMK_Emp DECRYPTION BY PASSWORD = 'Pa$$w0rd'; SELECT *, CONVERT(VARCHAR, DECRYPTBYKEY(Salary)) AS DecryptedSalary FROM Employees; GO 

-- Cleanup (Optional) 
DROP TABLE Employees; 
DROP SYMMETRIC KEY SMK_Emp; 
GO
Need more review? Encryption authenticators: SQL Server supports a number of different system functions for encryption, decryption, digital signing, and validation of digital signatures. To familiarize yourself with these functions, visit https://technet.microsoft.com/en-us/library/ms365192(v=sql.105).aspx.

 

Why do transaction logs grow so large?

A Shorter Answer:

You probably either have a long-running transaction (index maintenance? a big batch delete or update?), or you are in the “default” (more below on what is meant by default) recovery model of Full and have not taken a log backup (or aren’t taking them frequently enough).

If it is a recovery model issue, the simple answer could be to switch to the Simple recovery model if you do not need point-in-time recovery and regular log backups. Many people, though, make that their answer without understanding recovery models. Read on to understand why it matters and then decide what to do. You could also simply start taking log backups and stay in Full recovery. Both options are sketched below.
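A hedged sketch of both options (MyDB and the backup path are placeholders):

-- Check the current recovery model
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'MyDB';

-- Option 1: point-in-time recovery is not needed
ALTER DATABASE MyDB SET RECOVERY SIMPLE;

-- Option 2: stay in FULL and take regular log backups so the log can be reused
BACKUP LOG MyDB TO DISK = N'X:\Backups\MyDB_log.trn';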

There could be other reasons, but these are the most common. This answer begins to dive into the most common two reasons and gives you some background information on the why and how behind the reasons as well as explores some other reasons.

A longer Answer:

What scenarios can cause the log to keep growing? There are many reasons, but usually, these reasons are of the following two patterns: There is a misunderstanding about recovery models, or there are long-running transactions.

A recovery model is a database property that controls how transactions are logged, whether the transaction log requires (and allows) backing up, and what restore operations are available. Three recovery models exist: simple, full, and bulk-logged. Typically, a database uses the full recovery model or simple recovery model. A database can be switched to another recovery model at any time. We will not discuss the bulk-logged recovery model, because it is not part of the normal University architecture.

Recovery in General Concept

  • Crash/Restart Recovery

One purpose of the transaction log file is crash/restart recovery: rolling forward work that was done (redo) before a crash or restart, and rolling back work that was started but not finished (undo). The transaction log’s job is to notice when a transaction was started but never finished (rolled back, or a crash/restart happened before the transaction was committed). In that situation, it is the log’s job to say, “Hey… this never really finished, let’s roll it back” during recovery. It is also the log’s job to notice that you did finish something and that your client application was told it was finished (even if it hadn’t yet hardened to your data file) and to say, “Hey… this really happened, let’s roll it forward, let’s make it like the application thinks it was” after a restart. Now there is more, but that is the main purpose.

  • Point-in-Time Recovery

The other purpose for a transaction log file is to be able to give us the ability to recover to a point in time due to an “oops” in a database or to guarantee a recovery point in the event of a hardware failure involving the data and/or log files of a database. Suppose this transaction log contains the records of transactions that have been started and finished for recovery. In that case, SQL Server can and does then use this information to get a database to where it was before an issue happened. But that isn’t always an available option for us. For that to work, we have to have our database in the right recovery model and take log backups.

Recovery Models

  • Simple Recovery Model

With the above introduction, it is easiest to talk about Simple Recovery model first. In this model, you are telling SQL Server: “I am fine with you using your transaction log file for crash and restart recovery…” (You really have no choice there. Look up ACID properties and that should make sense quickly.) “…but once you no longer need it for that crash/restart recovery purpose, go ahead and reuse the log file.”

SQL Server listens to this request in Simple Recovery and it only keeps the information it needs to do crash/restart recovery. Once SQL Server is sure it can recover because data is hardened to the data file (more or less), the data that has been hardened is no longer necessary in the log and is marked for truncation – which means it gets re-used.

  • Full Recovery Model

With Full Recovery, you are telling SQL Server that you want to be able to recover to a specific point in time, as long as your log file is available, or to a specific point in time that is covered by a log backup. In this case, when SQL Server reaches the point where it would be safe to truncate the log file under the Simple Recovery model, it will not do so. Instead, under normal circumstances, it lets the log file continue to grow until you take a log backup (or run out of space on your log file drive).

Recovery Models Source Info

https://docs.microsoft.com/en-us/sql/relational-databases/backup-restore/recovery-models-sql-server?view=sql-server-2017

Example No 1 – When you switch from Simple to Full

There are rules and exceptions here. We’ll talk about long-running transactions in depth below.

But one warning to keep in mind for Full Recovery Mode is this: If you just switch into Full Recovery mode but never take an initial Full Backup, SQL Server will not honour your request to be in the Full Recovery model. Your transaction log will continue to operate as it has in Simple until you switch to Full Recovery Model AND take your first Full Backup.

Full Recovery Model without log backups is bad.

That’s the most common reason for uncontrolled log growth: being in Full Recovery mode without having any log backups. A really common mistake.

Why is this such a common mistake?

Why does it happen all the time? Because each new database gets its initial recovery model setting by looking at the model database.

The model database’s initial recovery model setting is always Full Recovery Model, until and unless someone changes it. So you could say the “default recovery model” is Full. Many people are unaware of this and have their databases running in Full Recovery Model with no log backups; as a result, the transaction log file grows much larger than necessary. This is why it is important to change defaults when they don’t work for your organisation and its needs.

Full Recovery Model with too few log backups is bad.

You can also get yourself in trouble here by not taking log backups frequently enough. Taking one log backup a day may sound fine (it makes a restore require fewer restore commands), but keeping in mind the discussion above, that log file will continue to grow and grow between backups. That point should not be a problem with the Maint – DatabaseBackup – ALL_DATABASES – LOG job.

How do I find out what log backup frequency I need? That is a question you need to answer when you are designing your solution.

You need to consider your log backup frequency with two things in mind:

  1. Recovery Needs– This should hopefully be first. In the event that the drive housing your transaction log goes bad or you get serious corruption that affects your log backup, how much data can be lost? If that number is no more than 10-15 minutes, then you need to be taking the log backup every 10-15 minutes, end of discussion.
  2. Log Growth– If your organisation is fine with losing more data because of the ability to recreate that day easily, you may be fine to have a log backup much less frequently than 15 minutes. Maybe your organization is fine with every 4 hours. But you must look at how many transactions you generate in 4 hours. Will allowing the log to keep growing in those four hours make too large of a log file? Will that mean your log backups take too long?

Querying the sys.databases catalog view, you can see information describing why your log file may be waiting on truncate/reuse.

There is a column called log_reuse_wait with a lookup ID of the reason code and a log_reuse_wait_desc column with a description of the wait reason. The referenced books and online articles cover the majority of the reasons (the ones you are likely to see and the ones whose causes we can explain). What do these columns mean?

SELECT name, log_reuse_wait_desc, log_reuse_wait FROM sys.databases;
  • 0 = Nothing
    What it sounds like: the log shouldn’t be waiting on anything.
  • 1 = Checkpoint
    Waiting for a checkpoint to occur. This should happen and you should be fine – but there are some cases to look for here for later answers or edits.
  • 2 = Log backup
    You are waiting for a log backup to occur. Either you have them scheduled and it will happen soon, or you have the first problem described here and you now know how to fix it.
  • 3 = Active backup or restore
    A backup or restore operation is running on the database
  • 4 = Active transaction
    There is an active transaction that needs to complete (either way – ROLLBACK or COMMIT) before the log can be backed up. This is the second reason described in this answer.
  • 5 = Database mirroring
    Either a mirror is lagging behind (or under some latency in a high-performance mirroring situation), or mirroring is paused for some reason.
  • 6 = Replication
    There can be issues with replication that would cause this, like a log reader agent not running, or a database that thinks it is marked for replication but no longer is, among various other reasons. You can also see this reason, perfectly normally, if you look at just the right time, just as transactions are being consumed by the log reader.
  • 7 = Database snapshot creation
    You are creating a database snapshot, you’ll see this if you look at just the right moment as a snapshot is being created
  • 8 = Log Scan
    I have yet to encounter an issue with this running along forever. If you look long enough and frequently enough you can see this happen, but it shouldn’t be a cause of excessive transaction log growth, that I’ve seen.
  • 9 = An Always On Availability Groups secondary replica is applying transaction log records of this database to a corresponding secondary database. About the clearest description yet...

Some things you don’t want to do:

  • Back up the log with TRUNCATE_ONLY option and then SHRINKFILE. For one, this TRUNCATE_ONLY option has been deprecated and is no longer available in current versions of SQL Server. Second, if you are in a FULL recovery model, this will destroy your log chain and require a new, full backup.
  • Detach the database, delete the log file, and re-attach. I can’t emphasize how dangerous this can be. Your database may not come back up, it may come up as a suspect, you may have to revert to a backup (if you have one), etc. etc.
  • Use the “shrink database” option. DBCC SHRINKDATABASE and the maintenance plan option to do the same are bad ideas, especially if you really only need to resolve a log problem. Target the file you want to adjust and adjust it independently, using DBCC SHRINKFILE or ALTER DATABASE … MODIFY FILE (see the sketch after this list).
  • Shrink the log file to 1 MB. This looks tempting because, hey, SQL Server will let me do it in certain scenarios, and look at all the space it frees! Unless your database is read-only (and it is, you should mark it as such using ALTER DATABASE), this will absolutely just lead to many unnecessary growth events, as the log has to accommodate current transactions regardless of the recovery model. What is the point of freeing up that space temporarily just so SQL Server can take it back slowly and painfully?
  • Create a second log file. This will provide temporary relief for the drive that has filled your disk, but this is like trying to fix a punctured lung with a band-aid. You should deal with the problematic log file directly instead of just adding another potential problem. Other than redirecting some transaction log activity to a different drive, a second log file really does nothing for you (unlike a second data file), since only one of the files can ever be used at a time. Paul Randal also explains why multiple log files can bite you later.
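A hedged sketch of the targeted approach (MyDB, the logical log file name, and the backup path are placeholders):

USE MyDB;
-- In FULL recovery, take a log backup first so the log can be truncated internally
BACKUP LOG MyDB TO DISK = N'X:\Backups\MyDB_log.trn';
-- Then shrink the log file to a sensible working size (here 8 GB), not 1 MB
DBCC SHRINKFILE (MyDB_log, 8192);  -- target size in MB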

Credits

https://docs.microsoft.com/en-us/sql/relational-databases/backup-restore/recovery-models-sql-server?view=sql-server-2017

Paul S. Randal – https://www.sqlskills.com/blogs/paul/sqlskills-sql101-switching-recovery-models/

https://dba.stackexchange.com/users/1192/paul-white

https://www.brentozar.com/archive/2016/03/my-favorite-system-column-log_reuse_wait_desc/