Access: Importing Excel SpreadsheetsBy
Data imported from Excel often contains records that should be in more than one table. To import it properly, the proper data structure needs to be established, and then data will be put where it belongs.
Figure 1 shows an Excel worksheet containing data for products ordered in April 2018. Each month is on a different sheet within the workbook. Importing the data from this workbook into Access will make reporting easier, allow for more flexibility, and make it simpler to correct or minimize mistakes in the data, such as misspelled names.
Since each sheet in the Excel workbook uses the same column headings, it might be tempting to simply create a table in Access containing the same fields—but the way data is organized in Excel isn’t necessarily the best way to organize it in Access. Leaving it in one table might lead to challenges later. So let’s look at the data.
The first column contains the day of the month that the order was taken, but the data isn’t formatted as a date—it’s simply a number. To get the actual date, you’d need to combine the value in column A, the sheet name (Apr), and the year referenced in the file name (2018).
Next come the customer code and customer name. Column D indicates the type of record for any customer that isn’t an individual. Serial # identifies what product was purchased, Price indicates how much the customer paid, Mfg gives the date the product was manufactured, and SRP is the suggested retail price.
Already, we’re starting to see potential issues with the data file. Columns C and D describe the customer, not the order. And, as you can see in rows 1 and 2, the name used for a particular customer isn’t always consistent, while information isn’t always added to Column D. The note in Column I also appears related to the customer rather than the order.
Data structure is an important element of database design. It refers to the collection of tables and fields that define how your data looks. Consider, for example, a simple ordering process where Customer ABC buys XYZ Product as part of Order 123. Although data may come in with all the detail for each of those items, customers, products, and orders are three very different objects. Unlike how they’re organized in the Excel workbook, they should be separated into their own tables. Each object is in a table with a list of like objects, and each row, or record, contains a set of fields that describes the characteristics for that object.
Once the elements are divided into proper tables, the next step is determining their relationships. Figure 2 shows a relationships diagram of customers, products, and orders. The Productz table has an AutoNumber primary key called ProductID. SerialNum has a unique index to prevent duplicates. There are also fields for the manufacture date and suggested retail price.
The Customerz table has an AutoNumber primary key called CustomerID. CustCode is for the customer code and is unique. The customer name is split into main name (last name for a human or company name) and first name. CustomerCalc is a calculated field with an expression to combine the parts of the names for reporting: [CustomerMain] & IIf([CustomerA] Is Null, “”, “, ” & [CustomerA]).
The Orderz table has an AutoNumber primary key called OrderID. ProductID is a Long Integer foreign key that identifies a product. CustomerID is a Long Integer foreign key that identifies a customer. There also are fields for the order date, the amount of the order, and a note.
Each of the tables has a tracking field called dtmAdd, which has a default value of Now(), so a value will be added automatically when records are created. Each table name ends with “z” to avoid using any reserved words. Using a reserved word could lead to errors when referring to the field.
There are two relationships with referential integrity: the relationship between customers and products on the CustomerID field and the relationship between products and orders on the ProductID field.
IMPORT DATA FROM EXCEL
To import data from Excel, go to the External Data tab and choose Excel. If you don’t see the Excel icon in the Import section, click on the New Data Source dropdown and choose it from there. When the dialog box appears, click Browse… and choose your Excel workbook file. In this case, it’s EXCEL_Orders_2018.xlsx. Click the Import option, then OK. Choose the first worksheet, Jan, and click the Next button. Check “First Row Contains Column Headings” and click Next. Then choose data types for each field. Change the data type for Dy (day) from Double to Integer. Click Next. Let Access add a primary key and click Next. Change the suggested table name to “importJan” and then click Finish. Close without saving the steps.
If the spreadsheets had a field that identified the month for each sheet, the next sheets could be appended to the same table. In this case, however, we’ll need to either calculate the order date and update it before adding more data or use a temporary field to distinguish the month.
We will continue this import process next month to modify the import data structure for the date calculation, combine data from all the sheets, then use Append and Update queries to shuffle the data into each of the three tables. We’ll also separate names into last name and first name.
Download the files for this month:
After importing information from Excel, create normalized tables with relationships in Access so that the data structure is patterned after objects in the real world.