Three categories that describe the nature of information resources
What are three categories that describe the nature of information resources? Give an example of each. How do you characterize the relationships within each category of information?
Structured information: A sales transaction with clearly defined fields for date, customer number, item number, and amount. Relationships within structured information are straightforward and easy to identify: a customer’s orders are related to the customer’s record and to the items purchased.
Unstructured information: A manila folder containing assorted items about a lawsuit, such as photos, handwritten notes, newspaper articles, or affidavits. This information has no inherent structure or order, so the relationships among its parts are hard to identify.
Semi-structured information: A web page with a title, subtitle, content, and a few images. Relationships within this kind of information are often hard to characterize because data in different formats are combined, so the information lacks strong controls on completeness and formatting. Unlike unstructured information, however, it is easier to query and combine.
What is metadata? What does metadata describe for structured information? For unstructured information? Give an example of each type of metadata.
Metadata is data about data, and it clarifies the nature of the information. For structured information, metadata describes the definitions of the fields, the tables, and their relationships. For unstructured information, metadata describes properties of a document or other resource, and it is especially useful because it layers some structure on information that is less easily categorized and classified. YouTube’s database, for example, contains metadata about each of its videos that can be searched and sorted. A library’s card catalog provides metadata about the books, such as where they are physically shelved. Word-processed documents are easier to organize if they include title, author, and subject in their properties.
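As a rough illustration of both kinds of metadata, the Python sketch below uses the standard-library sqlite3 module. The sales table, its field names, and the document properties are hypothetical and chosen only to mirror the examples in these notes. For the structured case, the database itself reports the field definitions; for the unstructured case, the metadata is a small set of properties attached to a document.

```python
import sqlite3

# Structured information: the schema is the metadata.
# The table and field names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "  sale_date TEXT NOT NULL,"
    "  customer_number INTEGER NOT NULL,"
    "  item_number INTEGER NOT NULL,"
    "  amount REAL NOT NULL)"
)

# PRAGMA table_info returns one row per field:
# (position, name, declared type, not-null flag, default value, primary-key flag)
field_definitions = [
    (name, declared_type)
    for _, name, declared_type, _, _, _ in conn.execute("PRAGMA table_info(sales)")
]

# Unstructured information: metadata is a set of descriptive properties
# layered on top of a document whose content has no fixed fields.
document_properties = {
    "title": "Lawsuit notes",
    "author": "J. Smith",
    "subject": "Affidavits and photos",
}


def find_documents(documents, **criteria):
    """Return the documents whose metadata matches every given property."""
    return [
        doc for doc in documents
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


matches = find_documents([document_properties], author="J. Smith")
```

Searching by author works here only because the property was recorded; the content of the document itself is never inspected.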
What are the characteristics of information that affect quality? What are examples of each?
Characteristics that affect the quality of information: accuracy, precision, completeness, consistency, timeliness, bias, and duplication.
Accuracy: Errors such as spelling mistakes lower the accuracy of information.
Precision: A rough estimate of the distance to a mall is not wrong and still gives useful information, but in a land survey an estimate is not acceptable.
Completeness: In an ordering process, a zip code can be determined from the rest of the address and can therefore be omitted, but the house number cannot.
Consistency: Reports that show “total sales by region” may conflict because the people generating the reports are using slightly different definitions. When results are inconsistent, the quality of both reports is in question.
Timeliness: Outdated information has less value than up-to-date information and thus is lower quality unless you are looking for historical trends. In stock trading, timeliness is measured in fractions of a second.
Bias: Biased information lacks objectivity, and that reduces its value and quality.
Duplication: Information can be redundant, resulting in misleading and exaggerated summaries.
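The effect of duplication on a summary can be shown with a small Python sketch. The records and field names below are made up for illustration; the point is only that a sale entered twice inflates the regional total until duplicates are removed.

```python
# Hypothetical sales records; order 102 was entered twice.
sales = [
    {"order_id": 101, "region": "East", "amount": 250.0},
    {"order_id": 102, "region": "West", "amount": 400.0},
    {"order_id": 102, "region": "West", "amount": 400.0},
    {"order_id": 103, "region": "East", "amount": 150.0},
]


def total_by_region(records):
    """Sum the amount field for each region."""
    totals = {}
    for record in records:
        region = record["region"]
        totals[region] = totals.get(region, 0.0) + record["amount"]
    return totals


def drop_duplicates(records, key="order_id"):
    """Keep only the first record seen for each key value."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


inflated = total_by_region(sales)                    # West is counted twice
corrected = total_by_region(drop_duplicates(sales))  # each order counted once
```

This sketch assumes that the order number reliably identifies a sale; real duplicates are often harder to spot because the copies differ slightly.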
What were the early design approaches to managing information resources?
In the early days, information was first managed by putting documents in envelopes and storing them in rows of small pigeonholes that lined entire walls. Later, the lateral filing cabinet was adopted, with vertical manila folders used for record keeping.
What are the major disadvantages of file processing systems? What are four specific problems associated with file processing systems?
Disadvantages: Problems in accessing data, security problems, data isolation, inconsistency, data redundancy, and integrity problems.
Data Redundancy and Inconsistency: Because most systems in an organization were not integrated or interconnected, the same information was often stored more than once, and the copies could become inconsistent.
Lack of Data Integration: Integrating data from separate systems is often a great challenge. For example, a payroll system maintains information about name, address, and pay history, but gender and ethnicity are in personnel records. If a manager wanted to compare pay rates by ethnicity, new programs had to be written to match up the records.
Inconsistent Data Definitions: When programmers write code to handle files, differences in format creep in. Phone numbers may include the dashes and be formatted as a text field in one system, but be treated as numbers in another. The other problems come with how people use a system. Data definitions may seem similar across systems, but they are used differently and summaries become misleading.
Data Dependence: Initial systems were hard to maintain because the programs and their files were so interconnected and dependent on one another. The programs all defined the fields and their formats, and business rules were all hard-coded or embedded in the programs. Even a minor change to accommodate a new business strategy took a lot of work.
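The phone number case mentioned under Inconsistent Data Definitions can be sketched in Python. The two stored values and the helper name normalize_phone are hypothetical; the sketch only shows why records from two systems fail to match until both are brought to one definition.

```python
import re


def normalize_phone(value):
    """Reduce a phone number stored as text or as a number to digits only."""
    return re.sub(r"\D", "", str(value))


# The same phone number as two hypothetical systems might store it.
payroll_value = "555-867-5309"   # text field that keeps the dashes
personnel_value = 5558675309     # numeric field

# A direct comparison fails because the formats differ.
raw_match = payroll_value == personnel_value

# After normalization the two records can be matched.
normalized_match = normalize_phone(payroll_value) == normalize_phone(personnel_value)
```

Storing phone numbers as numbers has a further cost that this sketch does not repair: any leading zeros are lost.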
Following the file processing model of data management, what three architectures emerged for integrated databases? What are the advantages of each? Are there disadvantages?
Hierarchical Model
This model presents data to its users as a hierarchy, which can be drawn as an inverted tree. The hierarchical model suits systems with simple structures and few linkages. Its limitation is that elements on the same level are not linked to one another, which makes it hard, for example, to query the products purchased by a customer.
Network Model
Unlike the hierarchical model, this model has no levels, so it solves the linking problem. Because there are no levels, any number of links between elements is possible, which makes the model complex, slow, and difficult to implement.
Relational Model
This model addresses the complexity and inflexibility of the earlier models and makes it easy to implement many-to-many relationships between entities. It distinguishes key from non-key data, and the key data are used to link records together.
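One common way to implement a many-to-many relationship in a relational database is a third table that holds only key data. The Python sketch below uses the standard-library sqlite3 module; the table names, column names, and sample rows are assumptions made for illustration. It answers the query that is hard in the hierarchical model: which products did a customer purchase?

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, name TEXT);
    -- Linking table: each row pairs one customer with one product.
    CREATE TABLE purchases (
        customer_id INTEGER REFERENCES customers(customer_id),
        product_id  INTEGER REFERENCES products(product_id),
        PRIMARY KEY (customer_id, product_id)
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO products  VALUES (10, 'Lamp'), (20, 'Desk');
    INSERT INTO purchases VALUES (1, 10), (1, 20), (2, 20);
""")


def products_for(customer_name):
    """List the products purchased by one customer."""
    rows = conn.execute(
        "SELECT p.name FROM customers c "
        "JOIN purchases pu ON pu.customer_id = c.customer_id "
        "JOIN products p ON p.product_id = pu.product_id "
        "WHERE c.name = ? ORDER BY p.name",
        (customer_name,),
    )
    return [name for (name,) in rows]


def customers_for(product_name):
    """List the customers who purchased one product."""
    rows = conn.execute(
        "SELECT c.name FROM products p "
        "JOIN purchases pu ON pu.product_id = p.product_id "
        "JOIN customers c ON c.customer_id = pu.customer_id "
        "WHERE p.name = ? ORDER BY c.name",
        (product_name,),
    )
    return [name for (name,) in rows]
```

The same linking table serves both directions, so products_for("Ana") and customers_for("Desk") need no extra structure.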
What are the steps in planning a relational data model? Are there benefits to the planning stage?
- Identify the kind of information needed
- List the tables and fields needed and enter the fields
- Identify the primary key and the other key fields
- Identify and develop the relationships between the tables and entities
- Add a few sample records and normalize the data
Planning is crucial because the database will support many of the organization’s operations, so the details need to be captured up front. Time spent planning reaps benefits in time saved making changes later.
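The last planning step, adding sample data and normalizing, can be sketched in plain Python. The draft table, its field names, and its rows are hypothetical. The sample rows expose a redundancy (the customer’s name and city repeat on every order), and normalizing splits the draft into two tables linked by customer_id.

```python
# Hypothetical sample rows from a first-draft, single-table design.
draft_orders = [
    {"order_id": 1, "customer_id": 7, "customer_name": "Ana", "city": "Austin", "amount": 20.0},
    {"order_id": 2, "customer_id": 7, "customer_name": "Ana", "city": "Austin", "amount": 35.0},
    {"order_id": 3, "customer_id": 8, "customer_name": "Ben", "city": "Boise", "amount": 12.5},
]


def normalize(rows):
    """Split draft rows into a customers table and an orders table."""
    customers = {}
    orders = []
    for row in rows:
        # Customer details are stored once, keyed by the primary key.
        customers[row["customer_id"]] = {
            "customer_name": row["customer_name"],
            "city": row["city"],
        }
        # Each order keeps only its own fields plus the foreign key.
        orders.append({
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "amount": row["amount"],
        })
    return customers, orders


customers, orders = normalize(draft_orders)
```

After the split, a change of city is made in one place instead of on every order row.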
What are primary keys and foreign keys? How are they used to create links between tables in a relational database?
Primary key: A field, or a group of fields, that makes each record unique in a table.
Foreign key: A primary key that appears as an attribute in a different table is a foreign key in that table. It can be used to link the records in the two tables together.
Primary and foreign keys create a link between two tables when the primary key of one table also appears in the other table as a foreign key; records in the two tables that share the same key value are related. For example, two tables, Employees and Departments, can be linked through DepartmentID, which is the primary key in Departments and the foreign key in Employees.
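The Employees and Departments example can be sketched with Python’s standard-library sqlite3 module. The column names other than DepartmentID and all sample rows are assumptions. The sketch shows the two uses of the key pair: joining the tables, and rejecting an employee whose DepartmentID matches no department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Departments (
        DepartmentID   INTEGER PRIMARY KEY,
        DepartmentName TEXT NOT NULL
    );
    CREATE TABLE Employees (
        EmployeeID   INTEGER PRIMARY KEY,
        EmployeeName TEXT NOT NULL,
        DepartmentID INTEGER NOT NULL REFERENCES Departments(DepartmentID)
    );
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Finance');
    INSERT INTO Employees VALUES (100, 'Ana', 1), (101, 'Ben', 2);
""")

# Link the two tables through the shared DepartmentID value.
linked = conn.execute(
    "SELECT e.EmployeeName, d.DepartmentName "
    "FROM Employees e JOIN Departments d ON e.DepartmentID = d.DepartmentID "
    "ORDER BY e.EmployeeID"
).fetchall()

# An employee pointing at a department that does not exist is refused.
try:
    conn.execute("INSERT INTO Employees VALUES (102, 'Cy', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```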
What is the typical strategy to access a database? How do users access an Access database? Are there other strategies to access database systems?
Databases are mostly accessed through applications with user-friendly interfaces that let people securely enter, edit, delete, and retrieve data. Such applications make it easier for suppliers and customers, as well as staff, to access a database with appropriate security controls. In Microsoft Access, users reach the database through its form-generating and report-writing tools. The other main strategy is to query the database directly, typically with SQL statements.
What is the role of the database administrator in managing the database? What is the career outlook for this job?
The database administrator is responsible for the efficient operation of the company’s databases: monitoring and optimizing performance, troubleshooting bottlenecks, setting up new databases, enhancing security, planning capacity requirements, designing backup and disaster recovery plans, and working with department heads and the IT team to resolve problems and build innovative applications. In the United States, database administration is one of the fastest-growing careers, and those who specialize in the field have very attractive job prospects.
What is SQL? How is it used to query a database?
SQL (Structured Query Language) is a standard query language, widely used to manipulate information in relational databases. SQL statements can select data and insert, edit, or delete records. For more advanced queries, joins can be used to retrieve data from multiple tables.
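The four basic operations can be tried with Python’s standard-library sqlite3 module. The items table and its rows are hypothetical; the SQL statements themselves (INSERT, UPDATE, DELETE, SELECT) are standard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (item_number INTEGER PRIMARY KEY, name TEXT, price REAL)"
)

# Insert records; the ? placeholders keep the values separate from the SQL text.
conn.executemany(
    "INSERT INTO items (item_number, name, price) VALUES (?, ?, ?)",
    [(1, "Pen", 1.5), (2, "Notebook", 4.0), (3, "Stapler", 9.0)],
)

# Edit a record.
conn.execute("UPDATE items SET price = ? WHERE item_number = ?", (4.5, 2))

# Delete a record.
conn.execute("DELETE FROM items WHERE item_number = ?", (3,))

# Select data that meets a condition.
remaining = conn.execute(
    "SELECT name, price FROM items WHERE price < ? ORDER BY item_number",
    (5.0,),
).fetchall()
```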
What is IVR? How is it used to query a database?
Interactive voice response (IVR) is a technology that gives callers access to a database by telephone. Unlike SQL, which uses typed commands, IVR uses signals transmitted over the phone to access the database, retrieve account information, and enter data.
What is a shadow system? Why are shadow systems sometimes used in organizations? How are they managed? What are the advantages of shadow systems? What are the disadvantages?
A shadow system is a smaller database developed by individuals outside the IT department that focuses on its creator’s specific information requirements. Shadow systems are adopted because they address their creators’ specific needs and are easy to build with tools such as Access and Excel. Their disadvantages are that they often do not contain what is in the corporate database and that they depend on their creator: once that person leaves, it is hard for a replacement to understand how the system works.
What is master data management? What is a data steward? What is the role of master data management in an organization’s integration strategy?
Master data management is an approach that addresses the underlying inconsistencies in the way employees use data by attempting to achieve consistent and uniform definitions for entities and their attributes across all business units. A data steward is a combination of watchdog and bridge builder, a person who ensures that people adhere to the definitions for the master data in their organizational units. By creating uniform definitions and attributes across the business and keeping differences in data to a minimum, master data management makes it easier to integrate the organization’s various systems.
What is a data warehouse? What are the three steps in building a data warehouse?
A data warehouse is a central data repository containing information drawn from multiple sources that can be used for analysis, intelligence gathering, and strategic planning. The first step in building a data warehouse is to extract data from its home database. The second is to transform and cleanse the data so that it adheres to common data definitions. The third is to load the cleansed data into the warehouse.
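The steps are commonly named extract, transform, and load (ETL), with loading the cleansed data into the warehouse as the third step. A minimal Python sketch follows; the two source systems, their field names, and the list used as the warehouse are stand-ins invented for illustration, not a real warehouse product.

```python
# Hypothetical home databases with different field names and formats.
payroll_db = [{"emp_name": "ana lopez", "phone": "555-867-5309"}]
personnel_db = [{"Name": "Ben Wu", "Phone": 5551234567}]


def extract():
    """Step 1: pull raw records out of each home database."""
    tagged = [("payroll", record) for record in payroll_db]
    tagged += [("personnel", record) for record in personnel_db]
    return tagged


def transform(tagged_records):
    """Step 2: cleanse records so they follow one common data definition."""
    cleaned = []
    for source, record in tagged_records:
        name = record["emp_name"] if source == "payroll" else record["Name"]
        phone = record["phone"] if source == "payroll" else record["Phone"]
        cleaned.append({
            "name": name.title(),
            "phone": "".join(ch for ch in str(phone) if ch.isdigit()),
            "source": source,
        })
    return cleaned


def load(records, warehouse):
    """Step 3: store the cleansed records in the warehouse."""
    warehouse.extend(records)
    return warehouse


warehouse = load(transform(extract()), [])
```

Keeping the source name on each record is a design choice of this sketch; it makes it possible to trace a warehouse row back to its home database.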
What are examples of internal sources of data for a data warehouse? What are examples of external sources of data for a data warehouse?
Internal data sources: Operational data, customer data, inventory, manufacturing data, archived historical data, website traffic data, and metadata drawn from multimedia, documents, and other sources.
External data sources: Census data, GPS/mapping coordinates, and competitor information.
What are four examples of data warehouse architectures? Which approach is suitable to meet today’s growing demand for real-time information?
- Relational database
- Data cubes
- Virtual federated warehouse
- Data warehouse appliance
- NoSQL
- In-memory database
Virtual federated warehouse: This approach is suited to the growing demand for real-time information because it extracts and transforms data in real time.
What is big data? What are the defining features of big data?
Big data: Collections of data that are so enormous in size, so varied in content, and so fast to accumulate that they are difficult to store and analyze using traditional approaches. The defining features of big data are volume, velocity, and variety.
What is data mining? What is the difference between data mining and data dredging? What is the goal of data mining?
Data mining is a type of intelligence gathering that uses statistical techniques to explore records in a data warehouse, hunting for hidden patterns and relationships that are undetectable in routine reports. Data dredging, by contrast, is the process of analyzing large data sets with a view to finding any possible relationship. Data mining has a clear subject of study: it begins with a hypothesis, followed by analysis of the data. For this reason, data dredging is often described as seeking more information from a data set than it actually contains.
What are examples of databases without boundaries?
A database without boundaries enables people from outside an enterprise to enter and manage records. Craigslist was initially developed to help people find apartments and jobs, and it was opened up so that the records belong to the customers: they post the advertisements and the replies on the site, so the records are managed externally. The same is true of social sites such as Facebook and Instagram, where users add their own photos, put up their own ads, and add, delete, and edit their own content.
How do ownership issues affect information management? How do information management needs differ among stakeholder groups?
In an organization, ownership issues arise naturally or through the privacy measures that are adopted. For example, salespeople may at their own discretion keep their sales lead contacts private so that they can maximize their sales and meet their monthly targets. By design, departments can also limit access to information in order to protect company secrets. Different stakeholder groups have different information needs. For example, the government requires compliance reports, such as tax and environmental compliance reports, prepared using its own definitions. Customers need information such as price breakdowns, presented through a simple interface, whereas top-level management needs information on the implementation of a given strategic plan.