Data Manipulation

Data are facts or statistics collected for analysis. They can take various forms, such as names, ages, weights, addresses, temperatures, distances and dates. These forms of data can be analysed to draw conclusions.

Data Manipulation

Data manipulation is the process of organizing, interpreting and arranging data to make it easy to understand.

Examples of its use:

  • In making company decisions, by giving insight into where the company could be in a few years.

  • In companies, for identifying the areas where they need to improve.

  • In the stock market, for forecasting trends.

  • In record keeping, for arranging the entries for individuals in alphabetical order.

Tools for data manipulation.

Many tools can be used to manipulate data, such as:

Excel, Structured Query Language (SQL), Pon, Tableau, pandas, Stata, Orange, RapidMiner, OpenRefine and Apache Spark.

Data manipulation in Excel.

In Excel, one can manipulate data by adding columns, formatting, and hiding or unhiding columns.

  • When data is imported from an external source.
  1. Copy and paste the data into a new sheet. This keeps the original data safe during cleaning.

  • To remove duplicates.
  1. Click on the Data tab.

  2. Select Remove Duplicates.

  3. Click OK.

  • To delete rows with empty cells.
  1. On the Home tab, click Find & Select.

  2. Click Go To Special.

  3. Choose Blanks, and all the empty cells are highlighted.

  4. On the Home tab, click Delete, then Delete Sheet Rows.

  5. All rows with empty cells are deleted.

  • To manipulate a column.
  1. Insert a new column B next to the column to be changed (column A).

  2. Write the formula in B1. E.g. to change the text in A1 to upper case, enter =UPPER(A1).

  3. Press Enter.

  4. Double-click the fill handle at the bottom-right corner of the cell, and the formula is filled down so that every row in column B is in uppercase.

  5. Copy column B and paste it in place as values.

  6. Delete column A, and column B automatically becomes column A.
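
The same cleaning steps can also be scripted rather than done by hand. The following is a minimal sketch in Python, not an Excel feature: it assumes the sheet has been read into a list of rows of text cells, and the sample rows and the function names remove_duplicates, drop_rows_with_blanks and uppercase_column are hypothetical, chosen only for illustration.

```python
import copy


def remove_duplicates(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def drop_rows_with_blanks(rows):
    """Drop every row that contains at least one empty cell."""
    return [row for row in rows if all(cell.strip() != "" for cell in row)]


def uppercase_column(rows, index):
    """Return new rows with the column at `index` converted to upper case."""
    return [row[:index] + [row[index].upper()] + row[index + 1:] for row in rows]


# Hypothetical sample data: name, city.
original = [
    ["alice", "Nairobi"],
    ["bob", ""],
    ["alice", "Nairobi"],
    ["carol", "Kampala"],
]

# Work on a copy so the original data stays safe during cleaning.
working = copy.deepcopy(original)
working = remove_duplicates(working)
working = drop_rows_with_blanks(working)
cleaned = uppercase_column(working, 0)

print(cleaned)
```

Each function returns a new list instead of changing its input, which mirrors the advice to clean a copy of the sheet and leave the imported data untouched.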

Data Manipulation Language (DML).

Data manipulation language (DML) is a computer programming language used for selecting, inserting, deleting and updating data in a database.

DML is most often encountered as part of SQL.

DML commands include:

  1. SELECT:

    This command is used to select data from a table. The command syntax is SELECT [column name(s)] FROM [table name] WHERE [conditions]. SELECT is the most widely used DML command in SQL.

  2. INSERT:

    This command is used to add data to an existing table. The command syntax is INSERT INTO [table name] ([column(s)]) VALUES ([value(s)]).

  3. UPDATE:

    This command is used to modify existing records in a table. The command syntax is UPDATE [table name] SET [column name = value] WHERE [condition].

  4. DELETE:

    This command is used to delete existing records from a table. The command syntax is DELETE FROM [table name] WHERE [condition].
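
The four commands above can be tried out without installing a database server. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the people table, its columns and the sample values are hypothetical and exist only for this illustration.

```python
import sqlite3

# In-memory database, discarded when the connection is closed.
conn = sqlite3.connect(":memory:")

# Creating the table is a data definition step, not DML.
# The `people` table and its columns are hypothetical.
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")

# INSERT: add rows to the existing table.
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [("Alice", 30), ("Bob", 25), ("Carol", 41)],
)

# UPDATE: modify the rows that match the condition.
conn.execute("UPDATE people SET age = ? WHERE name = ?", (26, "Bob"))

# DELETE: remove the rows that match the condition.
conn.execute("DELETE FROM people WHERE age > ?", (40,))

# SELECT: read back the chosen columns for the rows that match the condition.
rows = conn.execute(
    "SELECT name, age FROM people WHERE age >= ? ORDER BY name", (18,)
).fetchall()

print(rows)
conn.close()
```

Note that the WHERE clause is optional in UPDATE and DELETE, and leaving it out applies the change to every row in the table, so it is worth checking the condition with a SELECT first.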

Tips for data manipulation.

  • Know what information you need before you start.

  • Use automation tools for repetitive tasks.

  • Learn the mathematical functions that can be applied to your data.

  • Filter your data to find the specific results you need.

  • Use data visualization to present manipulated data.
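
As an illustration of the filtering and mathematical-function tips, the following is a minimal sketch in Python using only the standard library. The records and the 70 kg threshold are hypothetical values made up for the example.

```python
import statistics

# Hypothetical records: (name, weight in kg).
records = [("Dan", 91.0), ("Alice", 62.0), ("Carol", 71.5), ("Bob", 80.5)]

# Filter: keep only the records that answer the question being asked.
heavy = [(name, weight) for name, weight in records if weight > 70]

# Mathematical functions: summarise the filtered values.
weights = [weight for _, weight in heavy]
mean_weight = statistics.mean(weights)
max_weight = max(weights)

# Arrange the filtered records alphabetically by name for presentation.
heavy_sorted = sorted(heavy, key=lambda record: record[0])

print(heavy_sorted, mean_weight, max_weight)
```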