Data Representation in Computers: Number Systems, Characters, Audio, Image and Video

What is data representation in a computer?

A computer uses a fixed number of bits to represent a piece of data, which could be a number, a character, an image, a sound, a video, and so on. Data representation is the method used internally to represent data in a computer. Let us see how various types of data can be represented in computer memory.

Number Systems

Number systems are the techniques used to represent numbers in a computer's architecture; every value that you store in or retrieve from computer memory has a defined number system.

The number 289 is pronounced "two hundred and eighty-nine" and consists of the symbols 2, 8, and 9. Other number systems exist as well; each has its own symbols and its own method for constructing a number.

Let us discuss some of these number systems. Computer architectures support the following number systems:

Binary Number System

Octal Number System

Decimal Number System

The decimal number system has only ten (10) digits, from 0 to 9. Every number (value) in this system is represented with the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. The base of the decimal number system is 10 because it has only 10 digits.

Hexadecimal Number System
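
As an illustration (added here; the original article shows no code), the short Python sketch below writes the example value 289 in each of the four number systems named above. Python's built-in bin(), oct(), hex(), and int() functions do the conversions.

```python
# A minimal sketch: the same value written in the four common number systems.
value = 289  # the decimal example used earlier

print(bin(value))   # binary       -> 0b100100001
print(oct(value))   # octal        -> 0o441
print(value)        # decimal      -> 289
print(hex(value))   # hexadecimal  -> 0x121

# Converting back from a digit string in a given base:
assert int("100100001", 2) == 289
assert int("441", 8) == 289
assert int("121", 16) == 289
```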

Data Representation of Characters

There are different methods to represent characters; some of them are discussed below.

ASCII (American Standard Code for Information Interchange) is a 7-bit code. Since there are exactly 128 unique combinations of 7 bits, this 7-bit code can represent only 128 characters. Another version is ASCII-8, also called extended ASCII, which uses 8 bits for each character and can represent 256 different characters.

If ASCII-coded data is to be used in a computer that uses EBCDIC representation, it is necessary to transform ASCII code to EBCDIC code. Similarly, if EBCDIC coded data is to be used in an ASCII computer, EBCDIC code has to be transformed to ASCII.

Using 8-bit ASCII, we can represent only 256 characters, which is not enough for all the characters of the world's written languages and other symbols. Unicode was developed to resolve this problem. It aims to provide a standard character encoding scheme that is universal and efficient.
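
To make the character-encoding discussion concrete, here is a small Python sketch (my illustration; the article itself prescribes no language). It shows that an ASCII character fits in one byte, while a non-ASCII character needs a multi-byte Unicode encoding such as UTF-8.

```python
# A minimal sketch of character codes and encodings.
print(ord("A"))                  # 65 - the ASCII/Unicode code point of 'A'
print(chr(65))                   # 'A' - the code point turned back into a character

# ASCII needs only 7 bits, so 'A' is a single byte in both ASCII and UTF-8.
print("A".encode("ascii"))       # b'A'

# Characters outside ASCII need Unicode; UTF-8 uses a variable number of bytes.
print("€".encode("utf-8"))       # b'\xe2\x82\xac' - three bytes
print(len("€".encode("utf-8")))  # 3
```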

Data Representation of Audio, Image and Video

Often we have to represent and process data other than numbers and characters, such as audio, images, and video. Like numbers and characters, audio, image, and video data also carry information.

For example, an image is most commonly stored in the Joint Photographic Experts Group (JPEG) file format. An image file consists of two parts – header information and image data. Information such as the name of the file, its size, the date it was modified, and the file format is stored in the header part.

Image data is usually compressed, and numerous techniques are used to achieve compression. Depending on the application, images are stored in various file formats such as the bitmap file format (BMP), Tagged Image File Format (TIFF), Graphics Interchange Format (GIF), and Portable Network Graphics (PNG).
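
As a hedged illustration of the header idea described above (not taken from the article), the Python sketch below reads the first few bytes of an image file and matches them against the well-known signature bytes of the formats just listed. The file name "photo.jpg" is only a placeholder.

```python
# A minimal sketch: identify an image format from the signature bytes in its header.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"BM": "BMP",
}

def detect_format(path):
    with open(path, "rb") as f:
        header = f.read(8)  # the first 8 bytes are enough for these signatures
    for signature, name in SIGNATURES.items():
        if header.startswith(signature):
            return name
    return "unknown"

print(detect_format("photo.jpg"))  # placeholder file name
```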

Similarly, video is stored in various file formats such as AVI (Audio Video Interleave) – a format designed to store both audio and video data in a standard package that allows synchronized audio-with-video playback – as well as MPEG-2, WMV, MP4, etc.


5. Data Representation

Introduction

Computers are machines that do stuff with information. They let you view, listen, create, and edit information in documents, images, videos, sound, spreadsheets and databases. They let you play games in simulated worlds that don’t really exist except as information inside the computer’s memory and displayed on the screen. They let you compute and calculate with numerical information; they let you send and receive information over networks. Fundamental to all of this is that the computer has to represent that information in some way inside the computer’s memory, as well as storing it on disk or sending it over a network.

Chapter sections

  • 5.1. What's the big picture?
  • 5.2. Getting started
  • 5.3. Numbers
  • 5.5. Images and Colours
  • 5.6. Program Instructions
  • 5.7. The whole story!
  • 5.8. Further reading

Data representation

Computers use binary - the digits 0 and 1 - to store data. A binary digit, or bit, is the smallest unit of data in computing. It is represented by a 0 or a 1. Binary numbers are made up of binary digits (bits), eg the binary number 1001. The circuits in a computer's processor are made up of billions of transistors. A transistor is a tiny switch that is activated by the electronic signals it receives. The digits 1 and 0 used in binary reflect the on and off states of a transistor. Computer programs are sets of instructions. Each instruction is translated into machine code - simple binary codes that activate the CPU. Programmers write computer code and this is converted by a translator into binary instructions that the processor can execute. All software, music, documents, and any other information that is processed by a computer, is also stored using binary. [1]
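
As a brief added illustration in Python (not part of the cited material), the snippet below shows an integer and a character reduced to the bit patterns a computer actually stores.

```python
# A minimal sketch: integers and characters as bit patterns.
n = 9
print(format(n, "b"))         # '1001' - the binary digits of 9
print(format(n, "08b"))       # '00001001' - the same value padded to one byte

c = "C"
print(ord(c))                 # 67 - the code point stored for 'C'
print(format(ord(c), "08b"))  # '01000011' - the bit pattern for that code point
```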



How a file is stored on a computer

How an image is stored in a computer

The way in which data is represented in the computer

To include strings, integers, characters and colours. This should include considering the space taken by data, for instance the relation between the hexadecimal representation of colours and the number of colours available [3] .
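
As an added sketch (not part of the wiki text, and assuming Python), the relation between the hexadecimal representation of a colour and the number of colours available looks like this:

```python
# A minimal sketch: a 24-bit colour written in hexadecimal.
colour = "#1E90FF"            # two hex digits (one byte) each for red, green and blue

red = int(colour[1:3], 16)    # 0x1E = 30
green = int(colour[3:5], 16)  # 0x90 = 144
blue = int(colour[5:7], 16)   # 0xFF = 255
print(red, green, blue)

# Each channel takes 8 bits, so one colour takes 24 bits of space,
# giving 2**24 = 16,777,216 possible colours.
print(2 ** 24)
```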

This helpful material is used with gratitude from a computer science wiki under a Creative Commons Attribution 3.0 License [4]

Sound

  • Let's look at an oscilloscope
  • The BBC has an excellent article on how computers represent sound

Standards

  • Outline the way in which data is represented in the computer.

Computer Science: Reflections on the Field, Reflections from the Field (2004)

Chapter 5: Data, Representation, and Information

The preceding two chapters address the creation of models that capture phenomena of interest and the abstractions both for data and for computation that reduce these models to forms that can be executed by computer. We turn now to the ways computer scientists deal with information, especially in its static form as data that can be manipulated by programs.

Gray begins by narrating a long line of research on databases—storehouses of related, structured, and durable data. We see here that the objects of research are not data per se but rather designs of “schemas” that allow deliberate inquiry and manipulation. Gray couples this review with introspection about the ways in which database researchers approach these problems.

Databases support storage and retrieval of information by defining—in advance—a complex structure for the data that supports the intended operations. In contrast, Lesk reviews research on retrieving information from documents that are formatted to meet the needs of applications rather than predefined schematized formats.

Interpretation of information is at the heart of what historians do, and Ayers explains how information technology is transforming their paradigms. He proposes that history is essentially model building—constructing explanations based on available information—and suggests that the methods of computer science are influencing this core aspect of historical analysis.

DATABASE SYSTEMS: A TEXTBOOK CASE OF RESEARCH PAYING OFF

Jim Gray, Microsoft Research

A small research investment helped produce U.S. market dominance in the $14 billion database industry. Government and industry funding of a few research projects created the ideas for several generations of products and trained the people who built those products. Continuing research is now creating the ideas and training the people for the next generation of products.

Industry Profile

The database industry generated about $14 billion in revenue in 2002 and is growing at 20 percent per year, even though the overall technology sector is almost static. Among software sectors, the database industry is second only to operating system software. Database industry leaders are all U.S.-based corporations: IBM, Microsoft, and Oracle are the three largest. There are several specialty vendors: Tandem sells over $1 billion/ year of fault-tolerant transaction processing systems, Teradata sells about $1 billion/year of data-mining systems, and companies like Information Resources Associates, Verity, Fulcrum, and others sell specialized data and text-mining software.

In addition to these well-established companies, there is a vibrant group of small companies specializing in application-specific databases—for text retrieval, spatial and geographical data, scientific data, image data, and so on. An emerging group of companies offer XML-oriented databases. Desktop databases are another important market focused on extreme ease of use, small size, and disconnected (offline) operation.

Historical Perspective

Companies began automating their back-office bookkeeping in the 1960s. The COBOL programming language and its record-oriented file model were the workhorses of this effort. Typically, a batch of transactions was applied to the old-tape-master, producing a new-tape-master and printout for the next business day. During that era, there was considerable experimentation with systems to manage an online database that could capture transactions as they happened. At first these systems were ad hoc, but late in that decade network and hierarchical database products emerged. A COBOL subcommittee defined a network data model standard (DBTG) that formed the basis for most systems during the 1970s. Indeed, in 1980 DBTG-based Cullinet was the leading software company.

However, there were some problems with DBTG. DBTG uses a low-level, record-at-a-time procedural language to access information. The programmer has to navigate through the database, following pointers from record to record. If the database is redesigned, as often happens over a decade, then all the old programs have to be rewritten.

The relational data model, enunciated by IBM researcher Ted Codd in a 1970 Communications of the Association for Computing Machinery article,[1] was a major advance over DBTG. The relational model unified data and metadata so that there was only one form of data representation. It defined a non-procedural data access language based on algebra or logic. It was easier for end users to visualize and understand than the pointers-and-records-based DBTG model.

The research community (both industry and university) embraced the relational data model and extended it during the 1970s. Most significantly, researchers showed that a non-procedural language could be compiled to give performance comparable to the best record-oriented database systems. This research produced a generation of systems and people that formed the basis for products from IBM, Ingres, Oracle, Informix, Sybase, and others. The SQL relational database language was standardized by ANSI/ISO between 1982 and 1986. By 1990, virtually all database systems provided an SQL interface (including network, hierarchical, and object-oriented systems).

Meanwhile the database research agenda moved on to geographically distributed databases and to parallel data access. Theoretical work on distributed databases led to prototypes that in turn led to products. Today, all the major database systems offer the ability to distribute and replicate data among nodes of a computer network. Intense research on data replication during the late 1980s and early 1990s gave rise to a second generation of replication products that are now the mainstays of mobile computing.

Research of the 1980s showed how to execute each of the relational data operators in parallel—giving hundred-fold and thousand-fold speedups. The results of this research began to appear in the products of several major database companies. With the proliferation of data mining in the 1990s, huge databases emerged. Interactive access to these databases requires that the system use multiple processors and multiple disks to read all the data in parallel. In addition, these problems require near-linear time search algorithms. University and industrial research of the previous decade had solved these problems and forms the basis of the current VLDB (very large database) data-mining systems.

[1] E.F. Codd, 1970, “A Relational Model of Data for Large Shared Data Banks,” Communications of the ACM 13(6):377-387.

Rollup and drilldown data reporting systems had been a mainstay of decision-support systems ever since the 1960s. In the middle 1990s, the research community really focused on data-mining algorithms. They invented very efficient data cube and materialized view algorithms that form the basis for the current generation of business intelligence products.

The most recent round of government-sponsored research creating a new industry comes from the National Science Foundation’s Digital Libraries program, which spawned Google. It was founded by a group of “database” graduate students who took a fresh look at how information should be organized and presented in the Internet era.

Current Research Directions

There continues to be active and valuable research on representing and indexing data, adding inference to data search, compiling queries more efficiently, executing queries in parallel, integrating data from heterogeneous data sources, analyzing performance, and extending the transaction model to handle long transactions and workflow (transactions that involve human as well as computer steps). The availability of huge volumes of data on the Internet has prompted the study of data integration, mediation, and federation in which a portal system presents a unification of several data sources by pulling data on demand from different parts of the Internet.

In addition, there is great interest in unifying object-oriented concepts with the relational model. New data types (image, document, and drawing) are best viewed as the methods that implement them rather than by the bytes that represent them. By adding procedures to the database system, one gets active databases, data inference, and data encapsulation. This object-oriented approach is an area of active research and ferment both in academe and industry. It seems that in 2003, the research prototypes are mostly done and this is an area that is rapidly moving into products.

The Internet is full of semi-structured data—data that has a bit of schema and metadata, but is mostly a loose collection of facts. XML has emerged as the standard representation of semi-structured data, but there is no consensus on how such data should be stored, indexed, or searched. There have been intense research efforts to answer these questions. Prototypes have been built at universities and industrial research labs, and now products are in development.

The database research community now has a major focus on stream data processing. Traditionally, databases have been stored locally and are updated by transactions. Sensor networks, financial markets, telephone calls, credit card transactions, and other data sources present streams of data rather than a static database. The stream data processing researchers are exploring languages and algorithms for querying such streams and providing approximate answers.

Now that nearly all information is online, data security and data privacy are extremely serious and important problems. A small, but growing, part of the database community is looking at ways to protect people’s privacy by limiting the ways data is used. This work also has implications for protecting intellectual property (e.g., digital rights management, watermarking) and protecting data integrity by digitally signing documents and then replicating them so that the documents cannot be altered or destroyed.

Case Histories

The U.S. government funded many database research projects from 1972 to the present. Projects at the University of California at Los Angeles gave rise to Teradata and produced many excellent students. Projects at Computer Corp. of America (SDD-1, Daplex, Multibase, and HiPAC) pioneered distributed database technology and object-oriented database technology. Projects at Stanford University fostered deductive database technology, data integration technology, query optimization technology, and the popular Yahoo! and Google Internet sites. Work at Carnegie Mellon University gave rise to general transaction models and ultimately to the Transarc Corporation. There have been many other successes from AT&T, the University of Texas at Austin, Brown and Harvard Universities, the University of Maryland, the University of Michigan, Massachusetts Institute of Technology, Princeton University, and the University of Toronto among others. It is not possible to enumerate all the contributions here, but we highlight three representative research projects that had a major impact on the industry.

Project INGRES

Project Ingres started at the University of California at Berkeley in 1972. Inspired by Codd’s paper on relational databases, several faculty members (Stonebraker, Rowe, Wong, and others) started a project to design and build a relational system. Incidental to this work, they invented a query language (QUEL), relational optimization techniques, a language binding technique, and interesting storage strategies. They also pioneered work on distributed databases.

The Ingres academic system formed the basis for the Ingres product now owned by Computer Associates. Students trained on Ingres went on to start or staff all the major database companies (AT&T, Britton Lee, HP, Informix, IBM, Oracle, Tandem, Sybase). The Ingres project went on to investigate distributed databases, database inference, active databases, and extensible databases. It was rechristened Postgres, which is now the basis of the digital library and scientific database efforts within the University of California system. Recently, Postgres spun off to become the basis for a new object-relational system from the start-up Illustra Information Technologies.

Codd’s ideas were inspired by seeing the problems IBM and its customers were having with IBM’s IMS product and the DBTG network data model. His relational model was at first very controversial; people thought that the model was too simplistic and that it could never give good performance. IBM Research management took a gamble and chartered a small (10-person) systems effort to prototype a relational system based on Codd’s ideas. That system produced a prototype that eventually grew into the DB2 product series. Along the way, the IBM team pioneered ideas in query optimization, data independence (views), transactions (logging and locking), and security (the grant-revoke model). In addition, the SQL query language from System R was the basis for the ANSI/ISO standard.

The System R group went on to investigate distributed databases (project R*) and object-oriented extensible databases (project Starburst). These research projects have pioneered new ideas and algorithms. The results appear in IBM’s database products and those of other vendors.

Not all research ideas work out. During the 1970s there was great enthusiasm for database machines—special-purpose computers that would be much faster than general-purpose operating systems running conventional database systems. These research projects were often based on exotic hardware like bubble memories, head-per-track disks, or associative RAM. The problem was that general-purpose systems were improving at 50 percent per year, so it was difficult for exotic systems to compete with them. By 1980, most researchers realized the futility of special-purpose approaches and the database-machine community switched to research on using arrays of general-purpose processors and disks to process data in parallel.

The University of Wisconsin hosted the major proponents of this idea in the United States. Funded by the government and industry, those researchers prototyped and built a parallel database machine called Gamma. That system produced ideas and a generation of students who went on to staff all the database vendors. Today the parallel systems from IBM, Tandem, Oracle, Informix, Sybase, and Microsoft all have a direct lineage from the Wisconsin research on parallel database systems. The use of parallel database systems for data mining is the fastest-growing component of the database server industry.

The Gamma project evolved into the Exodus project at Wisconsin (focusing on an extensible object-oriented database). Exodus has now evolved to the Paradise system, which combines object-oriented and parallel database techniques to represent, store, and quickly process huge Earth-observing satellite databases.

And Then There Is Science

In addition to creating a huge industry, database theory, science, and engineering constitute a key part of computer science today. Representing knowledge within a computer is one of the central challenges of computer science ( Box 5.1 ). Database research has focused primarily on this fundamental issue. Many universities have faculty investigating these problems and offer classes that teach the concepts developed by this research program.


How can knowledge be represented so that algorithms can make new inferences from the knowledge base? This problem has challenged philosophers for millennia. There has been progress. Euclid axiomatized geometry and proved its basic theorems, and in doing so implicitly demonstrated mechanical reasoning from first principles. George Boole’s Laws of Thought created a predicate calculus, and Laplace’s work on probability was a first start on statistical inference.

Each of these threads—proofs, predicate calculus, and statistical inference—was a major advance, but each requires substantial human creativity to fit new problems to the solution. Wouldn’t it be nice if we could just put all the books and journals in a library that would automatically organize them and start producing new answers?

There are huge gaps between our current tools and the goal of a self-organizing library, but computer scientists are trying to fill the gaps with better algorithms and better ways of representing knowledge. Databases are one branch of this effort to represent information and reason about it. The database community has taken a bottom-up approach, working with simple data representations and developing a calculus for asking and answering questions about the database.

The fundamental approach of database researchers is to insist that the information must be schematized—the information must be represented in a predefined schema that assigns a meaning to each value. The author-title-subject-abstract schema of a library system is a typical example of this approach. The schema is used both to organize the data and to make it easy to express questions about the database.
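
As an illustrative sketch only (the essay names no particular system), the author-title-subject-abstract schema of a library catalogue could be declared and queried with Python's built-in sqlite3 module; the table and sample row below are invented for the example.

```python
# A minimal sketch of a predefined schema that assigns a meaning to each value.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE catalogue (
        author   TEXT,
        title    TEXT,
        subject  TEXT,
        abstract TEXT
    )
""")
conn.execute(
    "INSERT INTO catalogue VALUES (?, ?, ?, ?)",
    ("E.F. Codd", "A Relational Model of Data for Large Shared Data Banks",
     "databases", "Introduces the relational model."),
)

# Because every value has a declared meaning, questions are easy to pose.
for (title,) in conn.execute("SELECT title FROM catalogue WHERE subject = 'databases'"):
    print(title)
```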

Database researchers have labored to make it easy to define the schema, easy to add data to the database, and easy to pose questions to the database. Early database systems were dreadfully difficult to use—largely because we lacked the algorithms to automatically index huge databases and lacked powerful query tools. Today there are good tools to define schemas, and graphical tools that make it easy to explore and analyze the contents of a database.

This has required invention at all levels of the problem. At the lowest levels we had to discover efficient algorithms to sort, index, and organize numeric, text, temporal, and spatial information so that higher-level software could just pick from a wide variety of organizations and algorithms. These low-level algorithms mask data placement so that it can be spread among hundreds or thousands of disks; they mask concurrency so that the higher-level software can view a consistent data snapshot, even though the data is in flux. The low-level software includes enough redundancy so that once data is placed in the database, it is safe to assume that the data will never be lost. One major advance was the theory and algorithms to automatically guarantee these concurrency-reliability properties.

Text, spatial, and temporal databases have always posed special challenges. Certainly there have been huge advances in indexing these databases, but researchers still have many more problems to solve. The advent of image, video, and sound databases raises new issues. In particular, we are now able to extract a huge number of features from images and sounds, but we have no really good ways to index these features. This is just another aspect of the “curse of dimensionality” faced by database systems in the data-mining and data analysis area. When each object has more than a dozen attributes, traditional indexing techniques give little help in reducing the approximate search space.

So, there are still many unsolved research challenges for the low-level database “plumbers.”

The higher-level software that uses this plumbing has been a huge success. Early on, the research community embraced the relational data model championed by Ted Codd. Codd advocated the use of non-procedural set-oriented programming to define schemas and to pose queries. After a decade of experimentation, these research ideas evolved into the SQL database language. Having this high-level non-procedural language was a boon both to application programmers and to database implementers. Application programmers could write much simpler programs. The database implementers faced the challenge of optimizing and executing SQL. Because it is so high level (SQL is a non-procedural functional dataflow language), SQL allows data to be distributed across many computers and disks. Because the programs do not mention any physical structures, the implementer is free to use whatever “plumbing” is available. And because the language is functional, it can be executed in parallel.
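
A small follow-on sketch (again using sqlite3 purely for illustration) shows the non-procedural point in miniature: the query names no physical structures, so an index can be added or removed without changing the program.

```python
# A minimal sketch of non-procedural access: the query text never mentions
# physical structures, so adding an index changes the plan, not the program.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catalogue (author TEXT, title TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO catalogue VALUES (?, ?, ?)",
    [("Gray", "Transaction Processing", "databases"),
     ("Salton", "Automatic Text Processing", "information retrieval")],
)

query = "SELECT title FROM catalogue WHERE subject = ?"
print(conn.execute(query, ("databases",)).fetchall())

conn.execute("CREATE INDEX by_subject ON catalogue (subject)")
print(conn.execute(query, ("databases",)).fetchall())  # same query, same answer
```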

Techniques for implementing the relational data model and algorithms for efficiently executing database queries remain a core part of the database research agenda. Over the last decade, the traditional database systems have grown to include analytics (data cubes), and also data-mining algorithms borrowed from the machine-learning and statistics communities. There is increasing interest in solving information retrieval and multimedia database issues.

Today, there are very good tools for defining and querying traditional database systems, but there are still major research challenges in the traditional database field. The major focus is automating as many of the data administration tasks as possible—making the database system self-healing and self-managing.

We are still far from the goal of building systems that automatically ingest information, reason about it, and produce answers on demand. But the goal is closer, and it seems attainable within this century.

COMPUTER SCIENCE IS TO INFORMATION AS CHEMISTRY IS TO MATTER

Michael Lesk, Rutgers University

In other countries computer science is often called “informatics” or some similar name. Much computer science research derives from the need to access, process, store, or otherwise exploit some resource of useful information. Just as chemistry is driven to large extent by the need to understand substances, computing is driven by a need to handle data and information. As an example of the way chemistry has developed, see Oliver Sacks’s book Uncle Tungsten: Memories of a Chemical Boyhood (Vintage Books, 2002). He describes his explorations through the different metals, learning the properties of each, and understanding their applications. Similarly, in the history of computer science, our information needs and our information capabilities have driven parts of the research agenda. Information retrieval systems take some kind of information, such as text documents or pictures, and try to retrieve topics or concepts based on words or shapes. Deducing the concept from the bytes can be difficult, and the way we approach the problem depends on what kind of bytes we have and how many of them we have.

Our experimental method is to see if we can build a system that will provide some useful access to information or service. If it works, those algorithms and that kind of data become a new field: look at areas like geographic information systems. If not, people may abandon the area until we see a new motivation to exploit that kind of data. For example, face-recognition algorithms have received a new impetus from security needs, speeding up progress in the last few years. An effective strategy to move computer science forward is to provide some new kind of information and see if we can make it useful.

Chemistry, of course, involves a dichotomy between substances and reactions. Just as we can (and frequently do) think of computer science in terms of algorithms, we can talk about chemistry in terms of reactions. However, chemistry has historically focused on substances: the encyclopedias and indexes in chemistry tend to be organized and focused on compounds, with reaction names and schemes getting less space on the shelf. Chemistry is becoming more balanced as we understand reactions better; computer science has always been more heavily oriented toward algorithms, but we cannot ignore the driving force of new kinds of data.

The history of information retrieval, for example, has been driven by the kinds of information we could store and use. In the 1960s, for example, storage was extremely expensive. Research projects were limited to text
materials. Even then, storage costs meant that a research project could just barely manage to have a single ASCII document available for processing. For example, Gerard Salton’s SMART system, one of the leading text retrieval systems for many years (see Salton’s book, The SMART Automatic Retrieval System , Prentice-Hall, 1971), did most of its processing on collections of a few hundred abstracts. The only collections of “full documents” were a collection of 80 extended abstracts, each a page or two long, and a collection of under a thousand stories from Time Magazine , each less than a page in length. The biggest collection was 1400 abstracts in aeronautical engineering. With this data, Salton was able to experiment on the effectiveness of retrieval methods using suffixing, thesauri, and simple phrase finding. Salton also laid down the standard methodology for evaluating retrieval systems, based on Cyril Cleverdon’s measures of “recall” (percentage of the relevant material that is retrieved in response to a query) and “precision” (the percentage of the material retrieved that is relevant). A system with perfect recall finds all relevant material, making no errors of omission and leaving out nothing the user wanted. In contrast, a system with perfect precision finds only relevant material, making no errors of commission and not bothering the user with stuff of no interest. The SMART system produced these measures for many retrieval experiments and its methodology was widely used, making text retrieval one of the earliest areas of computer science with agreed-on evaluation methods. Salton was not able to do anything with image retrieval at the time; there were no such data available for him.
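
Since recall and precision are defined above in words, a tiny Python sketch (my own illustration, not Salton's or Cleverdon's code) may help pin down the arithmetic:

```python
# A minimal sketch of the recall and precision measures.
def recall(retrieved, relevant):
    """Fraction of the relevant material that was actually retrieved."""
    return len(retrieved & relevant) / len(relevant)

def precision(retrieved, relevant):
    """Fraction of the retrieved material that is actually relevant."""
    return len(retrieved & relevant) / len(retrieved)

retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5"}

print(recall(retrieved, relevant))     # 2/3, about 0.67
print(precision(retrieved, relevant))  # 2/4 = 0.5
```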

Another idea shaped by the amount of information available was “relevance feedback,” the idea of identifying useful documents from a first retrieval pass in order to improve the results of a later retrieval. With so few documents, high precision seemed like an unnecessary goal. It was simply not possible to retrieve more material than somebody could look at. Thus, the research focused on high recall (also stimulated by the insistence by some users that they had to have every single relevant document). Relevance feedback helped recall. By contrast, the use of phrase searching to improve precision was tried but never got much attention simply because it did not have the scope to produce much improvement in the running systems.

The basic problem is that we wish to search for concepts, and what we have in natural language are words and phrases. When our documents are few and short, the main problem is not to miss any, and the research at the time stressed algorithms that found related words via associations or improved recall with techniques like relevance feedback.

Then, of course, several other advances—computer typesetting and word processing to generate material and cheap disks to hold it—led to much larger text collections. Figure 5.1 shows the decline in the price of disk space since the first disks in the mid-1950s, generally following the cost-performance trends of Moore’s law.

FIGURE 5.1 Decline in the price of disk space, 1950 to 2004.

Cheaper storage led to larger and larger text collections online. Now there are many terabytes of data on the Web. These vastly larger volumes mean that precision has now become more important, since a common problem is to wade through vastly too many documents. Not surprisingly, in the mid-1980s efforts started on separating the multiple meanings of words like “bank” or “pine” and became the research area of “sense disambiguation.” 2 With sense disambiguation, it is possible to imagine searching for only one meaning of an ambiguous word, thus avoiding many erroneous retrievals.

Large-scale research on text processing took off with the availability of the TREC (Text Retrieval Evaluation Conference) data. Thanks to the National Institute of Standards and Technology, several hundred megabytes of text were provided (in each of several years) for research use. This stimulated more work on query analysis, text handling, searching algorithms, and related areas; see the series titled TREC Conference Proceedings, edited by Donna Harmon of NIST.

[2] See Michael Lesk, 1986, “How to Tell a Pine Cone from an Ice Cream Cone,” , pp. 26-28.

Document clustering appeared as an important way to shorten long search results. Clustering enables a system to report not, say, 5000 documents but rather 10 groups of 500 documents each, and the user can then explore the group or groups that seem relevant. Salton anticipated the future possibility of such algorithms, as did others. 3 Until we got large collections, though, clustering did not find application in the document retrieval world. Now one routinely sees search engines using these techniques, and faster clustering algorithms have been developed.
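
As a hedged sketch of the clustering idea (using scikit-learn, which the essay does not mention and which must be installed separately), a long result list can be grouped so that a user browses a few clusters instead of thousands of documents:

```python
# A minimal sketch: cluster search results so users browse groups, not long lists.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

documents = [
    "database transaction processing and recovery",
    "relational databases and SQL query optimization",
    "image retrieval by colour and texture",
    "shape detection in image collections",
]

vectors = TfidfVectorizer().fit_transform(documents)  # documents -> term vectors
labels = KMeans(n_clusters=2, n_init=10).fit_predict(vectors)

for label, text in zip(labels, documents):
    print(label, text)  # documents sharing a label form one group
```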

Thus the algorithms explored switched from recall aids to precision aids as the quantity of available data increased. Manual thesauri, for example, have dropped out of favor for retrieval, partly because of their cost but also because their goal is to increase recall, which is not today’s problem. In terms of finding the concepts hinted at by words and phrases, our goals now are to sharpen rather than broaden these concepts: thus disambiguation and phrase matching, and not as much work on thesauri and term associations.

Again, multilingual searching started to matter, because multilingual collections became available. Multilingual research shows a more precise example of particular information resources driving research. The Canadian government made its Parliamentary proceedings (called Hansard ) available in both French and English, with paragraph-by-paragraph translation. This data stimulated a number of projects looking at how to handle bilingual material, including work on automatic alignment of the parallel texts, automatic linking of similar words in the two languages, and so on. 4

A similar effect was seen with the Brown corpus of tagged English text, where the part of speech of each word (e.g., whether a word is a noun or a verb) was identified. This produced a few years of work on algorithms that learned how to assign parts of speech to words in running text based on statistical techniques, such as the work by Garside. 5

  

[3] See, for example, N. Jardine and C.J. van Rijsbergen, 1971, “The Use of Hierarchical Clustering in Information Retrieval,” 7:217-240.

[4] See, for example, T.K. Landauer and M.L. Littman, 1990, “Fully Automatic Cross-Language Document Retrieval Using Latent Semantic Indexing,” pp. 31-38, University of Waterloo Centre for the New OED and Text Research, Waterloo, Ontario, October; or I. Dagan and Ken Church, 1997, “Termight: Coordinating Humans and Machines in Bilingual Terminology Acquisition,” 12(1/2):89-107.

[5] Roger Garside, 1987, “The CLAWS Word-tagging System,” in R. Garside, G. Leech, and G. Sampson (eds.), , Longman, London.

One might see an analogy to various new fields of chemistry. The recognition that pesticides like DDT were environmental pollutants led to a new interest in biodegradability, and the Freon propellants used in aerosol cans stimulated research in reactions in the upper atmosphere. New substances stimulated a need to study reactions that previously had not been a top priority for chemistry and chemical engineering.

As storage became cheaper, image storage was now as practical as text storage had been a decade earlier. Starting in the 1980s we saw the IBM QBIC project demonstrating that something could be done to retrieve images directly, without having to index them by text words first. 6 Projects like this were stimulated by the availability of “clip art” such as the COREL image disks. Several different projects were driven by the easy access to images in this way, with technology moving on from color and texture to more accurate shape processing. At Berkeley, for example, the “Blobworld” project made major improvements in shape detection and recognition, as described in Carson et al. 7 These projects demonstrated that retrieval could be done with images as well as with words, and that properties of images could be found that were usable as concepts for searching.

Another new kind of data that became feasible to process was sound, in particular human speech. Here it was the Defense Advanced Research Projects Agency (DARPA) that took the lead, providing the SWITCHBOARD corpus of spoken English. Again, the availability of a substantial file of tagged information helped stimulate many research projects that used this corpus and developed much of the technology that eventually went into the commercial speech recognition products we now have. As with the TREC contests, the competitions run by DARPA based on its spoken language data pushed the industry and the researchers to new advances. National needs created a new technology; one is reminded of the development of synthetic rubber during World War II or the advances in catalysis needed to make explosives during World War I.

Yet another kind of new data was geo-coded data, introducing a new set of conceptual ideas related to place. Geographical data started showing up in machine-readable form during the 1980s, especially with the release of the Dual Independent Map Encoding (DIME) files after the 1980 census and the Topologically Integrated Geographic Encoding and Referencing (TIGER) files from the 1990 census. The availability, free of charge, of a complete U.S. street map stimulated much research on systems to display maps, to give driving directions, and the like.[8] When aerial photographs also became available, there was the triumph of Microsoft’s “Terraserver,” which made it possible to look at a wide swath of the world from the sky along with correlated street and topographic maps.[9]

[6] See, for example, Wayne Niblack, Ron Barber, William Equitz, Myron Flickner, Eduardo H. Glasman, Dragutin Petkovic, Peter Yanker, Christos Faloutsos, and Gabriel Taubin, 1993, “The QBIC Project: Querying Images by Content, Using Color, Texture, and Shape,” pp. 173-187.

[7] C. Carson, M. Thomas, S. Belongie, J.M. Hellerstein, and J. Malik, 1999, “Blobworld: A System for Region-based Image Indexing and Retrieval,” , Springer-Verlag, Amsterdam, pp. 509-516.

More recently, in the 1990s, we have started to look at video search and retrieval. After all, if a CD-ROM contains about 300,000 times as many bytes per pound as a deck of punched cards, and a digitized video has about 500,000 times as many bytes per second as the ASCII script it comes from, we should be about where we were in the 1960s with video today. And indeed there are a few projects, most notably the Informedia project at Carnegie Mellon University, that experiment with video signals; they do not yet have ways of searching enormous collections, but they are developing algorithms that exploit whatever they can find in the video: scene breaks, closed-captioning, and so on.

Again, there is the problem of deducing concepts from a new kind of information. We started with the problem of words in one language needing to be combined when synonymous, picked apart when ambiguous, and moved on to detecting synonyms across multiple languages and then to concepts depicted in pictures and sounds. Now we see research such as that by Jezekiel Ben-Arie associating words like “run” or “hop” with video images of people doing those actions. In the same way we get again new chemistry when molecules like “buckyballs” are created and stimulate new theoretical and reaction studies.

Defining concepts for search can be extremely difficult. For example, despite our abilities to parse and define every item in a computer language, we have made no progress on retrieval of software; people looking for search or sort routines depend on metadata or comments. Some areas seem more flexible than others: text and naturalistic photograph processing software tends to be very general, while software to handle CAD diagrams and maps tends to be more specific. Algorithms are sometimes portable; both speech processing and image processing need Fourier transforms, but the literature is less connected than one might like (partly because of the difference between one-dimensional and two-dimensional transforms).

[8] An early publication was R. Elliott and M. Lesk, 1982, “Route Finding in Street Maps by Computers and People,” , Pittsburgh, Pa., August, pp. 258-261.

[9] T. Barclay, J. Gray, and D. Slutz, 2000, “Microsoft Terraserver: A Spatial Data Warehouse,” , Association for Computing Machinery, New York, pp. 307-318.

There are many other examples of interesting computer science research stimulated by the availability of particular kinds of information. Work on string matching today is often driven by the need to align sequences in either protein or DNA data banks. Work on image analysis is heavily influenced by the need to deal with medical radiographs. And there are many other interesting projects specifically linked to an individual data source. Among examples:

The British Library scanning of the original manuscript of Beowulf in collaboration with the University of Kentucky, working on image enhancement until the result of the scanning is better than reading the original;

The Perseus project, demonstrating the educational applications possible because of the earlier Thesaurus Linguae Graecae project, which digitized all the classical Greek authors;

The work in astronomical analysis stimulated by the Sloan Digital Sky Survey;

The creation of the field of “forensic paleontology” at the University of Texas as a result of doing MRI scans of fossil bones;

And, of course, the enormous amount of work on search engines stimulated by the Web.

When one of these fields takes off, and we find wide usage of some online resource, it benefits society. Every university library gained readers as their catalogs went online and became accessible to students in their dorm rooms. Third World researchers can now access large amounts of technical content their libraries could rarely acquire in the past.

In computer science, and in chemistry, there is a tension between the algorithm/reaction and the data/substance. For example, should one look up an answer or compute it? Once upon a time logarithms were looked up in tables; today we also compute them on demand. Melting points and other physical properties of chemical substances are looked up in tables; perhaps with enough quantum mechanical calculation we could predict them, but it’s impractical for most materials. Predicting tomorrow’s weather might seem a difficult choice. One approach is to measure the current conditions, take some equations that model the atmosphere, and calculate forward a day. Another is to measure the current conditions, look in a big database for the previous day most similar to today, and then take the day after that one as the best prediction for tomorrow. However, so far the meteorologists feel that calculation is better. Another complicated example is chess: given the time pressure of chess tournaments against speed and storage available in computers, chess programs do the opening and the endgame by looking in tables of old data and calculate for the middle game.

To conclude, a recipe for stimulating advances in computer science is to make some data available and let people experiment with it. With the incredibly cheap disks and scanners available today, this should be easier than ever. Unfortunately, what we gain with technology we are losing to law and economics. Many large databases are protected by copyright; few motion pictures, for example, are old enough to have gone out of copyright. Content owners generally refuse to grant permission for wide use of their material, whether out of greed or fear: they may have figured out how to get rich off their files of information or they may be afraid that somebody else might have. Similarly it is hard to get permission to digitize in-copyright books, no matter how long they have been out of print. Jim Gray once said to me, “May all your problems be technical.” In the 1960s I was paying people to key in aeronautical abstracts. It never occurred to us that we should be asking permission of the journals involved (I think what we did would qualify as fair use, but we didn’t even think about it). Today I could scan such things much more easily, but I would not be able to get permission. Am I better off or worse off?

There are now some 22 million chemical substances in the Chemical Abstracts Service Registry and 7 million reactions. New substances continue to intrigue chemists and cause research on new reactions, with of course enormous interest in biochemistry both for medicine and agriculture. Similarly, we keep adding data to the Web, and new kinds of information (photographs of dolphins, biological flora, and countless other things) can push computer scientists to new algorithms. In both cases, synthesis of specific instances into concepts is a crucial problem. As we see more and more kinds of data, we learn more about how to extract meaning from it, and how to present it, and we develop a need for new algorithms to implement this knowledge. As the data gets bigger, we learn more about optimization. As it gets more complex, we learn more about representation. And as it gets more useful, we learn more about visualization and interfaces, and we provide better service to society.

HISTORY AND THE FUNDAMENTALS OF COMPUTER SCIENCE

Edward L. Ayers, University of Virginia

We might begin with a thought experiment: What is history? Many people, I’ve discovered, think of it as books and the things in books. That’s certainly the explicit form in which we usually confront history. Others, thinking less literally, might think of history as stories about the past; that would open us to oral history, family lore, movies, novels, and the other forms in which we get most of our history.

All these images are wrong, of course, in the same way that images of atoms as little solar systems are wrong, or pictures of evolution as profiles of ever taller and more upright apes and people are wrong. They are all models, radically simplified, that allow us to think about such things in the exceedingly small amounts of time that we allot to these topics.

The same is true for history, which is easiest to envision as technological progress, say, or westward expansion, or the emergence of freedom—or as increasing alienation, exploitation of the environment, or the growth of intrusive government.

Those of us who think about specific aspects of society or nature for a living, of course, are never satisfied with the stories that suit the purposes of everyone else so well.

We are troubled by all the things that don’t fit, all the anomalies, variance, and loose ends. We demand more complex measurement, description, and fewer smoothing metaphors and lowest common denominators.

Thus, to scientists, atoms appear as clouds of probability; evolution appears as a branching, labyrinthine bush in which some branches die out and others diversify. It can certainly be argued that past human experience is as complex as anything in nature and likely much more so, if by complexity we mean numbers of components, variability of possibilities, and unpredictability of outcomes.

Yet our means of conveying that complexity remain distinctly analog: the story, the metaphor, the generalization. Stories can be wonderfully complex, of course, but they are complex in specific ways: of implication, suggestion, evocation. That’s what people love and what they remember.

But maybe there is a different way of thinking about the past: as information. In fact, information is all we have. Studying the past is like studying scientific processes for which you have the data but cannot run the experiment again, in which there is no control, and in which you can never see the actual process you are describing and analyzing. All we have is information in various forms: words in great abundance, billions of numbers, millions of images, some sounds and buildings, artifacts.

The historian’s goal, it seems to me, should be to account for as much of the complexity embedded in that information as we can. That, it appears, is what scientists do, and it has served them well.

And how has science accounted for ever-increasing amounts of complexity in the information it uses? Through ever more sophisticated instruments. The connection between computer science and history could be analogous to that between telescopes and stars, microscopes and cells. We could be on the cusp of a new understanding of the patterns of complexity in human behavior of the past.

The problem may be that there is too much complexity in that past, or too much static, or too much silence. In the sciences, we’ve learned how to filter, infer, use indirect evidence, and fill in the gaps, but we have a much more literal approach to the human past.

We have turned to computer science for tasks of more elaborate description, classification, representation. The digital archive my colleagues and I have built, the Valley of the Shadow Project, permits the manipulation of millions of discrete pieces of evidence about two communities in the era of the American Civil War. It uses sorting mechanisms, hypertextual display, animation, and the like to allow people to handle the evidence of this part of the past for themselves. This isn’t cutting-edge computer science, of course, but it’s darned hard and deeply disconcerting to some, for it seems to abdicate responsibility, to undermine authority, to subvert narrative, to challenge story.

Now, we’re trying to take this work to the next stage, to analysis. We have composed a journal article that employs an array of technologies, especially geographic information systems and statistical analysis in the creation of the evidence. The article presents its argument, evidence, and historiographical context as a complex textual, tabular, and graphical representation. XML offers a powerful means to structure text and XSL an even more powerful means to transform it and manipulate its presentation. The text is divided into sections called “statements,” each supported with “explanation.” Each explanation, in turn, is supported by evidence and connected to relevant historiography.

Linkages, forward and backward, between evidence and narrative are central. The historiography can be automatically sorted by author, date, or title; the evidence can be arranged by date, topic, or type. Both evidence and historiographical entries are linked to the places in the analysis where they are invoked. The article is meant to be used online, but it can be printed in a fixed format with all the limitations and advantages of print.

So, what are the implications of thinking of the past in the hardheaded sense of admitting that all we really have of the past is information? One implication might be great humility, since all we have for most of the past are the fossils of former human experience, words frozen in ink and images frozen in line and color. Another implication might be hubris: if we suddenly have powerful new instruments, might we be on the threshold of a revolution in our understanding of the past? We’ve been there before.

A connection between history and social science was tried before, during the first days of accessible computers. Historians taught themselves statistical methods and even programming languages so that they could adopt the techniques, models, and insights of sociology and political science. In the 1950s and 1960s the creators of the new political history called on historians to emulate the precision, explicitness, replicability, and inclusivity of the quantitative social sciences. For two decades that quantitative history flourished, promising to revolutionize the field. And to a considerable extent it did: it changed our ideas of social mobility, political identification, family formation, patterns of crime, economic growth, and the consequences of ethnic identity. It explicitly linked the past to the present and held out a history of obvious and immediate use.

But that quantitative social science history collapsed suddenly, the victim of its own inflated claims, limited method and machinery, and changing academic fashion. By the mid-1980s, history, along with many of the humanities and social sciences, had taken the linguistic turn. Rather than software manuals and codebooks, graduate students carried books of French philosophy and German literary interpretation. The social science of choice shifted from sociology to anthropology; texts replaced tables. A new generation defined itself in opposition to social scientific methods just as energetically as an earlier generation had seen in those methods the best means of writing a truly democratic history. The first computer revolution largely failed.

The first effort at that history fell into decline in part because historians could not abide the distance between their most deeply held beliefs and what the statistical machinery permitted, the abstraction it imposed. History has traditionally been built around contingency and particularity, but the most powerful tools of statistics are built on sampling and extrapolation, on generalization and tendency. Older forms of social history talked about vague and sometimes dubious classifications in part because that was what the older technology of tabulation permitted us to see. It has become increasingly clear across the social sciences that such flat ways of describing social life are inadequate; satisfying explanations must be dynamic, interactive, reflexive, and subtle, refusing to reify structures of social life or culture. The new technology permits a new cross-fertilization.

Ironically, social science history faded just as computers became widely available, just as new kinds of social science history became feasible. No longer is there any need for white-coated attendants at huge mainframes and expensive proprietary software. Rather than reducing people to rows and columns, searchable databases now permit researchers to maintain the identities of individuals in those databases and to represent entire populations rather than samples. Moreover, the record can now include things social science history could only imagine before the Web: completely indexed newspapers, with the original readable on the screen; completely searchable letters and diaries by the thousands; and interactive maps with all property holders identified and linked to other records. Visualization of patterns in the data, moreover, far outstrips the possibilities of numerical calculation alone. Manipulable histograms, maps, and time lines promise a social history that is simultaneously sophisticated and accessible. We have what earlier generations of social science historians dreamed of: a fast and widely accessible network linked to cheap and powerful computers running common software with well-established standards for the handling of numbers, texts, and images. New possibilities of collaboration and cumulative research beckon. Perhaps the time is right to reclaim a worthy vision of a disciplined and explicit social scientific history that we abandoned too soon.

What does this have to do with computer science? Everything, it seems to me. If you want hard problems, historians have them. And what’s the hardest problem of all right now? The capture of the very information that is history. Can computer science imagine ways to capture historical information more efficiently? Can it offer ways to work with the spotty, broken, dirty, contradictory, nonstandardized information we work with?

The second hard problem is the integration of this disparate evidence in time and space, offering new precision, clarity, and verifiability, as well as opening new questions and new ways of answering them.

If we can think of these ways, then we face virtually limitless possibilities. Is there a more fundamental challenge or opportunity for computer science than helping us to figure out human society over human time?


Computer Science: Reflections on the Field, Reflections from the Field provides a concise characterization of key ideas that lie at the core of computer science (CS) research. The book offers a description of CS research recognizing the richness and diversity of the field. It brings together two dozen essays on diverse aspects of CS research, their motivation and results. By describing in accessible form computer science's intellectual character, and by conveying a sense of its vibrancy through a set of examples, the book aims to prepare readers for what the future might hold and help to inspire CS researchers in its creation.


Data Representation in Computer Science

Dive deep into the realm of Computer Science with this comprehensive guide about data representation. Data representation, a fundamental concept in computing, refers to the various ways that information can be expressed digitally. The interpretation of this data plays a critical role in decision-making procedures in businesses and scientific research. Gain an understanding of binary data representation, the backbone of digital computing. 


Binary data representation uses a system of numerical notation that has just two possible states represented by 0 and 1 (also known as 'binary digits' or 'bits'). Grasp the practical applications of binary data representation and explore its benefits.

Finally, explore the vast world of data model representation. Different types of data models offer a variety of ways to organise data in databases. Understand the strategic role of data models in data representation, and explore how they are used to design efficient database systems. This comprehensive guide positions you at the heart of data representation in Computer Science.

Understanding Data Representation in Computer Science

In the realm of Computer Science, data representation plays a paramount role. It refers to the methods or techniques used to represent or express information in a computer system. This encompasses everything from text and numbers to images, audio, and beyond.

Basic Concepts of Data Representation

Data representation in computer science is about how a computer interprets and functions with different types of information. Different information types require different representation techniques. For instance, a video will be represented differently than a text document.

When working with various forms of data, it is important to grasp a fundamental understanding of:

  • Binary system
  • Bits and Bytes
  • Number systems: decimal, hexadecimal
  • Character encoding: ASCII, Unicode

Data in a computer system is represented in binary format, as a sequence of 0s and 1s, denoting 'off' and 'on' states respectively. The smallest component of this binary representation is known as a bit , which stands for 'binary digit'.

A byte, on the other hand, generally encompasses 8 bits. Essential to expressing numbers and text in a computer system are the decimal and hexadecimal number systems, along with character encodings like ASCII and Unicode.

Role of Data Representation in Computer Science

Data Representation is the foundation of computing systems and affects both hardware and software designs. It enables both logic and arithmetic operations to be performed in the binary number system , on which computers are based.

An illustrative example of the importance of data representation is when you write a text document. The characters you type are represented in ASCII code - a set of binary numbers. Each number is sent to the memory, represented as electrical signals; everything you see on your screen is a representation of the underlying binary data.
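Below is a minimal Python sketch of this idea (illustrative only, not tied to any particular word processor): it prints the numeric code and an 8-bit binary pattern for each typed character, using Python's built-in ord() and format() functions.

    # Show the ASCII/Unicode code point and 8-bit binary pattern for each character.
    text = "Hi!"
    for ch in text:
        code = ord(ch)                 # numeric code point of the character
        bits = format(code, "08b")     # 8-bit binary representation
        print(f"'{ch}' -> {code} -> {bits}")
    # 'H' -> 72  -> 01001000
    # 'i' -> 105 -> 01101001
    # '!' -> 33  -> 00100001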

Computing operations and functions, like searching, sorting or adding, rely heavily on appropriate data representation for efficient execution. Also, computer programming languages and compilers require a deep understanding of data representation to successfully interpret and execute commands.

As technology evolves, so too do our data representation techniques. Quantum computing, for example, uses quantum bits or "qubits". A qubit can represent a 0, 1, or both at the same time, thanks to the phenomenon of quantum superposition.

Types of Data Representation

In computer systems , various types of data representation techniques are utilized:

Numbers can be represented in real, integer, and rational formats. Text is represented by using different types of encodings, such as ASCII or Unicode. Images can be represented in various formats like JPG, PNG, or GIF, each having its specific rendering algorithm and compression techniques.

Tables are another important way of data representation, especially in the realm of databases .

Name       Email
John Doe   [email protected]
Jane Doe   [email protected]

This approach is particularly effective in storing structured data, making information readily accessible and easy to handle. By understanding the principles of data representation, you can better appreciate the complexity and sophistication behind our everyday interactions with technology.

Data Representation and Interpretation

To delve deeper into the world of Computer Science, it is essential to study the intricacies of data representation and interpretation. While data representation is about the techniques through which data are expressed or encoded in a computer system, data interpretation refers to the computing machines' ability to understand and work with these encoded data.

Basics of Data Representation and Interpretation

The core of data representation and interpretation is founded on the binary system. Represented by 0s and 1s, the binary system signifies the 'off' and 'on' states of electric current, seamlessly translating them into a language comprehensible to computing hardware.

For instance, \(1101_2 = 13_{10}\); that is, 1101 in binary is equivalent to 13 in decimal. This interpretation happens consistently in the background during all of your interactions with a computer system.

Now, try imagining a vast array of these binary numbers. It could get overwhelming swiftly. To bring order and efficiency to this chaos, binary digits (or bits) are grouped into larger sets like bytes, kilobytes, and so on. A single byte , the most commonly used set, contains eight bits. Here's a simplified representation of how bits are grouped:

  • 1 bit = Binary Digit
  • 8 bits = 1 byte
  • 1024 bytes = 1 kilobyte (KB)
  • 1024 KB = 1 megabyte (MB)
  • 1024 MB = 1 gigabyte (GB)
  • 1024 GB = 1 terabyte (TB)

However, the binary system isn't the only number system pivotal for data interpretation. Both decimal (base 10) and hexadecimal (base 16) systems play significant roles in processing numbers and text data. Moreover, translating human-readable language into computer interpretable format involves character encodings like ASCII (American Standard Code for Information Interchange) and Unicode.

These systems interpret alphabetic characters, numerals, punctuation marks, and other common symbols into binary code. For example, the ASCII value for capital 'A' is 65, which corresponds to \(01000001\) in binary.

In the world of images, different encoding schemes interpret pixel data; JPG, PNG, and GIF are common examples of such encoded formats. Similarly, audio files utilise encoding formats like MP3 and WAV to store sound data.

Importance of Data Interpretation in Computer Science

Understanding data interpretation in computer science is integral to unlocking the potential of any computing process or system. When coded data is input into a system, your computer must interpret this data accurately to make it usable.

Consider typing a document in a word processor like Microsoft Word. As you type, each keystroke is converted to an ASCII code by your keyboard. Stored as binary, these codes are transmitted to the active word processing software. The word processor interprets these codes back into alphabetic characters, enabling the correct letters to appear on your screen, as per your keystrokes.

Data interpretation is not just an isolated occurrence, but a recurring necessity - needed every time a computing process must deal with data. This is no different when you're watching a video, browsing a website, or even when the computer boots up.

Rendering images and videos is an ideal illustration of the importance of data interpretation.

Digital photos and videos are composed of tiny dots, or pixels, each encoded with specific numbers to denote colour composition and intensity. Every time you view a photo or play a video, your computer interprets the underlying data and reassembles the pixels to form a comprehensible image or video sequence on your screen.

Data interpretation further extends to more complex territories like facial recognition, bioinformatics, data mining, and even artificial intelligence. In these applications, data from various sources is collected, converted into machine-acceptable format, processed, and interpreted to provide meaningful outputs.

In summary, data interpretation is vital for the functionality, efficiency, and progress of computer systems and the services they provide. Understanding the basics of data representation and interpretation therefore forms the backbone of computer science studies.

Delving into Binary Data Representation

Binary data representation is the most fundamental and elementary form of data representation in computing systems. At the lowermost level, every piece of information processed by a computer is converted into a binary format.

Understanding Binary Data Representation

Binary data representation is based on the binary numeral system. This system, also known as the base-2 system, uses only two digits, 0 and 1, to represent all kinds of data. The concept dates back to early eighteenth-century mathematics and has since become the bedrock of modern computers. In computing, the binary system's digits are called bits (short for 'binary digit'), and they are the smallest indivisible unit of data.

Each bit can be in one of two states representing 0 ('off') or 1 ('on'). Formally, the binary number \( b_n b_{n-1} ... b_2 b_1 b_0 \), is interpreted using the formula: \[ B = b_n \times 2^n + b_{n-1} \times 2^{n-1} + ... + b_2 \times 2^2 + b_1 \times 2^1 + b_0 \times 2^0 \] Where \( b_i \) are the binary digits and \( B \) is the corresponding decimal number.

For example, for the binary number 1011, the process looks like this: \[ B = 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 8 + 0 + 2 + 1 = 11 \]

This mathematical translation makes it possible for computing machines to perform complex operations even though they understand only the simple language of 'on' and 'off' signals.
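As a rough illustration of the place-value formula above, here is a short Python sketch; the function name binary_to_decimal is an illustrative choice, and Python's built-in int(s, 2) is used only as a cross-check.

    # Apply B = sum(b_i * 2**i) over the digits of a binary string.
    def binary_to_decimal(bits: str) -> int:
        value = 0
        for position, digit in enumerate(reversed(bits)):
            value += int(digit) * (2 ** position)   # b_i * 2^i
        return value

    print(binary_to_decimal("1011"))   # 8 + 0 + 2 + 1 = 11
    print(int("1011", 2))              # built-in check, also 11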

When representing character data, computing systems use binary-encoded formats. ASCII and Unicode are common examples. In ASCII, each character is assigned a unique 7-bit binary code. For example, the 7-bit binary representation of the uppercase letter 'A' is 1000001 (decimal 65). Interpreting such encoded data back to a human-readable format is a core responsibility of computing systems and forms the basis for the exchange of digital information globally.

Practical Application of Binary Data Representation

Binary data representation is used across every single aspect of digital computing. From simple calculations performed by a digital calculator to the complex animations rendered in a high-definition video game, binary data representation is at play in the background.

Consider a simple calculation like 7+5. When you input this into a digital calculator, the numbers and the operation get converted into their binary equivalents. The microcontroller inside the calculator processes these binary inputs, performs the sum operation in binary, and finally, returns the result as a binary output. This binary output is then converted back into a decimal number which you see displayed on the calculator screen.
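A hedged Python illustration of the same idea follows; the "calculator" here is just ordinary Python arithmetic, shown alongside the 4-bit binary form of each operand and of the result.

    # Show 7 + 5 both in decimal and in binary form.
    a, b = 7, 5
    total = a + b
    print(format(a, "04b"), "+", format(b, "04b"), "=", format(total, "04b"))
    # 0111 + 0101 = 1100   (12 in decimal)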

When it comes to text files, every character typed into the document is converted to its binary equivalent using a character encoding system, typically ASCII or Unicode. It is then saved onto your storage device as a sequence of binary digits.

Similarly, for image files, every pixel is represented as a binary number; the complete grid of these numbers, called a bitmap, specifies the colour and intensity of each pixel. When you open the image file, the computer reads the binary data and presents it on your screen as a colourful, coherent image. The concept extends even further into the internet and network communications, data encryption, data compression, and more.

When you are downloading a file over the internet, it is sent to your system as a stream of binary data. The web browser on your system receives this data, recognizes the type of file and accordingly interprets the binary data back into the intended format.

In essence, every operation that you can perform on a computer system, no matter how simple or complex, essentially boils down to large-scale manipulation of binary data. And that sums up the practical application and universal significance of binary data representation in digital computing.

Binary Tree Representation in Data Structures

Binary trees occupy a central position in data structures , especially in algorithms and database designs. As a non-linear data structure, a binary tree is essentially a tree-like model where each node has a maximum of two children, often distinguished as 'left child' and 'right child'.

Fundamentals of Binary Tree Representation

A binary tree is a tree data structure where each parent node has no more than two children, typically referred to as the left child and the right child. Each node in the binary tree contains:

  • A data element
  • Pointer or link to the left child
  • Pointer or link to the right child

The topmost node of the tree is known as the root. The nodes without any children, usually dwelling at the tree's last level, are known as leaf nodes or external nodes. Binary trees are fundamentally differentiated by their properties and the relationships among the elements. Some types include:

  • Full Binary Tree: A binary tree where every node has 0 or 2 children.
  • Complete Binary Tree: A binary tree where all levels are completely filled except possibly the last level, which is filled from left to right.
  • Perfect Binary Tree: A binary tree where all internal nodes have two children and all leaves are at the same level.
  • Skewed Binary Tree: A binary tree where every node has only a left child or only a right child.

In a binary tree, the maximum number of nodes \( N \) at any level \( L \) is \( N = 2^{L-1} \); for example, level 3 can hold at most \( 2^{2} = 4 \) nodes. Conversely, a tree with \( N \) nodes needs at least \( \lceil \log_2(N+1) \rceil \) levels; 15 nodes, for instance, require at least 4 levels.

Binary tree representation employs arrays and linked lists. Sometimes, an implicit array-based representation suffices, especially for complete binary trees. The root is stored at index 0, while for each node at index \( i \), the left child is stored at index \( 2i + 1 \), and the right child at \( 2i + 2 \).

However, the most common representation is the linked-node representation that utilises a node-based structure. Each node in the binary tree is a data structure that contains a data field and two pointers pointing to its left and right child nodes.
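Here is a minimal Python sketch of such a linked-node structure, assuming the simple layout described above (a data field plus left and right pointers); the Node class name and the tiny example tree are illustrative assumptions.

    # A linked-node binary tree: each node holds data and two child pointers.
    class Node:
        def __init__(self, data):
            self.data = data
            self.left = None     # pointer to left child (or None)
            self.right = None    # pointer to right child (or None)

    # Build a small tree:    1
    #                       / \
    #                      2   3
    root = Node(1)
    root.left = Node(2)
    root.right = Node(3)

    def inorder(node):
        """Visit left subtree, then the node, then the right subtree."""
        if node is not None:
            inorder(node.left)
            print(node.data, end=" ")
            inorder(node.right)

    inorder(root)   # prints: 2 1 3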

Usage of Binary Tree in Data Structures

Binary trees are typically used for expressing hierarchical relationships, and thus find application across various areas in computer science. In mathematical applications, binary trees are ideal for expressing certain elements' relationships.

For example, binary trees are used to represent expressions in arithmetic and Boolean algebra.

Consider an arithmetic expression like (4 + 5) * 6. This can be represented using a binary tree where the operators are parent nodes, and the operands are children. The expression gets evaluated by performing operations in a specific tree traversal order.
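A small Python sketch of such an expression tree follows; it handles only the + and * operators needed for this example, and the class and function names are illustrative choices.

    # Expression tree for (4 + 5) * 6: operators are parents, operands are leaves.
    class Expr:
        def __init__(self, value, left=None, right=None):
            self.value, self.left, self.right = value, left, right

    tree = Expr("*", Expr("+", Expr(4), Expr(5)), Expr(6))

    def evaluate(node):
        if node.left is None and node.right is None:   # leaf: an operand
            return node.value
        lhs, rhs = evaluate(node.left), evaluate(node.right)
        return lhs + rhs if node.value == "+" else lhs * rhs   # only + and * here

    print(evaluate(tree))   # 54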

Among the more complex usages, binary search trees — a variant of binary trees — are employed in database engines and file systems .

  • Binary Heaps, a type of binary tree, are used as an efficient priority queue in many algorithms like Dijkstra's algorithm and the Heap Sort algorithm.
  • Binary trees are also used in creating binary space partition trees, which are used for quickly finding objects in games and 3D computer graphics.
  • Syntax trees used in compilers are a direct application of binary trees. They help translate high-level language expressions into machine code.
  • Huffman Coding Trees, which are used in data compression algorithms, are another variant of binary trees.

The theoretical underpinnings of all these binary tree applications are the traversal methods and operations, such as insertion and deletion, which are intrinsic to the data structure.

Binary trees are also used in advanced machine-learning algorithms. Decision Tree is a type of binary tree that uses a tree-like model of decisions. It is one of the most successful forms of supervised learning algorithms in data mining and machine learning.

The advantages of binary trees lie in their efficient organisation and quick data access, making them a cornerstone of many complex data structures and algorithms. Understanding the workings and fundamentals of binary tree representation will give you a stronger grounding in the world of data structures and computer science in general.

Grasping Data Model Representation

When dealing with vast amounts of data, organising and understanding the relationships between different pieces of data is of utmost importance. This is where data model representation comes into play in computer science. A data model provides an abstract, simplified view of real-world data. It defines the data elements and the relationships among them, providing an organised and consistent representation of data.

Exploring Different Types of Data Models

Understanding the intricacies of data models will equip you with a solid foundation in making sense of complex data relationships. Some of the most commonly used data models include:

  • Hierarchical Model
  • Network Model
  • Relational Model
  • Entity-Relationship Model
  • Object-Oriented Model
  • Semantic Model

The Hierarchical Model presents data in a tree-like structure, where each record has one parent record and many children. This model is largely applied in file systems and XML documents. The limitations are that this model does not allow a child to have multiple parents, thus limiting its real-world applications.

The Network Model, an enhancement of the hierarchical model, allows a child node to have multiple parent nodes, resulting in a graph structure. This model is suitable for representing complex relationships but comes with its own challenges such as iteration and navigation, which can be intricate.

The Relational Model, created by E.F. Codd, uses a tabular structure to depict data and their relationships. Each row represents a collection of related data values, and each column represents a particular attribute. This is the most widely used model due to its simplicity and flexibility.

The Entity-Relationship Model illustrates the conceptual view of a database. It uses three basic concepts: Entities, Attributes (the properties of these entities), and Relationships among entities. This model is most commonly used in database design .

The Object-Oriented Model goes a step further and adds methods (functions) to the entities besides attributes. This data model integrates the data and the operations applicable to the data into a single component known as an object. Such an approach enables encapsulation, a significant characteristic of object-oriented programming.

The Semantic Model aims to capture more meaning of data by defining the nature of data and the relationships that exist between them. This model is beneficial in representing complex data interrelations and is used in expert systems and artificial intelligence fields.

The Role of Data Models in Data Representation

Data models provide a method for the efficient representation and interaction of data elements, thus forming an integral part of any database system. They provide the theoretical foundation for designing databases, thereby playing an essential role in the development of applications.

A data model is a set of concepts and rules for formally describing and representing real-world data. It serves as a blueprint for designing and implementing databases and assists communication between system developers and end-users.

Databases serve as vast repositories, storing a plethora of data. Such vast data needs effective organisation and management for optimal access and usage. Here, data models come into play, providing a structural view of data, thereby enabling the efficient organisation, storage and retrieval of data.

Consider a library system. The system needs to record data about books, authors, publishers, members, and loans. All these items represent different entities. Relationships exist between these entities. For example, a book is published by a publisher, an author writes a book, or a member borrows a book. Using an Entity-Relationship Model, we can effectively represent all these entities and relationships, aiding the library system's development process.
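As a rough sketch of how these entities and relationships might map onto relational tables, the following snippet uses Python's standard sqlite3 module; the table and column names are illustrative assumptions, not a prescribed schema.

    # Illustrative mapping of library entities onto relational tables.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book   (id INTEGER PRIMARY KEY, title TEXT,
                             author_id INTEGER REFERENCES author(id));
        CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE loan   (book_id INTEGER REFERENCES book(id),
                             member_id INTEGER REFERENCES member(id),
                             due_date TEXT);
    """)
    con.execute("INSERT INTO author VALUES (1, 'E. F. Codd')")
    con.execute("INSERT INTO book VALUES (1, 'A Relational Model', 1)")
    query = "SELECT b.title, a.name FROM book b JOIN author a ON a.id = b.author_id"
    for row in con.execute(query):
        print(row)   # ('A Relational Model', 'E. F. Codd')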

Designing such a model requires careful consideration of what data is required to be stored and how different data elements relate to each other. Depending on their specific requirements, database developers can select the most suitable data model representation. This choice can significantly affect the functionality, performance, and scalability of the resulting databases.

From decision-support systems and expert systems to distributed databases and data warehouses, data models find a place in various applications.

Modern NoSQL databases often use several models simultaneously to meet their needs. For example, a document-based model for unstructured data and a column-based model for analyzing large data sets. In this way, data models continue to evolve and adapt to the burgeoning needs of the digital world.

Therefore, acquiring a strong understanding of data model representations and their roles forms an integral part of the database management and design process. It empowers you with the ability to handle large volumes of diverse data efficiently and effectively.

Data Representation - Key takeaways

  • Data representation refers to techniques used to express information in computer systems, encompassing text, numbers, images, audio, and more.
  • Data Representation is about how computers interpret and function with different information types, including binary systems, bits and bytes, number systems (decimal, hexadecimal) and character encoding (ASCII, Unicode).
  • Binary Data Representation is the conversion of all kinds of information processed by a computer into binary format.
  • Binary trees express hierarchical relationships and are used across various areas in computer science.
  • Binary trees represent relationships in mathematical applications and are used in database engines, file systems, and priority queues in algorithms.
  • Data Model Representation is an abstract, simplified view of real-world data that defines the data elements, and their relationships and provides a consistently organised way of representing data.


Frequently Asked Questions about Data Representation in Computer Science

What is data representation?

Data representation is the method used to encode information into a format that can be used and understood by computer systems. It involves the conversion of real-world data, such as text, images, sounds, numbers, into forms like binary or hexadecimal which computers can process. The choice of representation can affect the quality, accuracy and efficiency of data processing. Precisely, it's how computer systems interpret and manipulate data.

What does data representation mean?

Data representation refers to the methods or techniques used to express, display or encode data in a readable format for a computer or a user. This could be in different forms such as binary, decimal, or alphabetic forms. It's crucial in computer science since it links the abstract world of thought and concept to the concrete domain of signals, signs and symbols. It forms the basis of information processing and storage in contemporary digital computing systems.

Why is data representation important?

Data representation is crucial as it allows information to be processed, transferred, and interpreted in a meaningful way. It helps in organising and analysing data effectively, providing insights for decision-making processes. Moreover, it facilitates communication between the computer system and the real world, enabling computing outcomes to be understood by users. Finally, accurate data representation ensures integrity and reliability of the data, which is vital for effective problem solving.

How to make a graphical representation of data?

To create a graphical representation of data, first collect and organise your data. Choose a suitable form of data representation such as bar graphs, pie charts, line graphs, or histograms depending on the type of data and the information you want to display. Use a data visualisation tool or software such as Excel or Tableau to help you generate the graph. Always remember to label your axes and provide a title and legend if necessary.

What is data representation in statistics?

Data representation in statistics refers to the various methods used to display or present data in meaningful ways. This often includes the use of graphs, charts, tables, histograms or other visual tools that can help in the interpretation and analysis of data. It enables efficient communication of information and helps in drawing statistical conclusions. Essentially, it's a way of providing a visual context to complex datasets, making the data easily understandable.


What are the different ways of Data Representation?

The process of collecting data and analyzing it in large quantities is known as statistics. It is a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of numerical facts and figures.

Statistics helps us to collect and analyze data in large quantities, and it is based on two concepts:

  • Statistical Data 
  • Statistical Science

Statistics must be expressed numerically and should be collected systematically.

Data Representation

The word data refers to facts about people, things, events, and ideas. A data item can be a name, a number, or any other recorded value. After collecting data, the investigator has to condense it in tabular form to study its salient features. Such an arrangement is known as the presentation of data.

It refers to the process of condensing the collected data in a tabular form or graphically. This arrangement of data is known as Data Representation.

The raw data can be arranged in different orders: ascending order, descending order, or alphabetical order.

Example: Let the marks obtained by 10 students of class V in a class test, out of 50, according to their roll numbers, be: 39, 44, 49, 40, 22, 10, 45, 38, 15, 50. The data in this form is known as raw data. The data can be placed in serial order as shown below:

Roll No.   Marks
1          39
2          44
3          49
4          40
5          22
6          10
7          45
8          38
9          15
10         50

Now, suppose you want to analyse the standard of achievement of the students. Arranging the marks in ascending or descending order gives a better picture.

Ascending order: 10, 15, 22, 38, 39, 40, 44, 45, 49, 50
Descending order: 50, 49, 45, 44, 40, 39, 38, 22, 15, 10

When the data is placed in ascending or descending order, it is known as arrayed data.

Types of Graphical Data Representation

A bar chart helps us to represent the collected data visually. The data can be visualized horizontally or vertically, showing amounts and frequencies, and the bars can be grouped or single. A bar chart helps us in comparing different items: by looking at the bars, it is easy to see which items in a group of data are larger or smaller than the others.

Now let us understand the bar chart by taking this example. Let the marks obtained by 5 students of class V in a class test, out of 10, according to their names, be: 7, 8, 4, 9, 6. The data in this form is known as raw data. The data can be tabulated and then drawn as a bar chart:

Name     Marks
Akshay   7
Maya     8
Dhanvi   4
Jaslen   9
Muskan   6
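Assuming the matplotlib plotting package is installed, a short Python sketch like the following could draw this data as a vertical bar chart.

    # Plot the class test marks above as a bar chart (requires matplotlib).
    import matplotlib.pyplot as plt

    names = ["Akshay", "Maya", "Dhanvi", "Jaslen", "Muskan"]
    marks = [7, 8, 4, 9, 6]

    plt.bar(names, marks)
    plt.xlabel("Name")
    plt.ylabel("Marks (out of 10)")
    plt.title("Class test marks")
    plt.show()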

A histogram is a graphical representation of data. It looks similar to a bar graph, but the two differ: a bar graph shows the frequency of categorical data (data based on two or more categories, like gender or months), whereas a histogram is used for quantitative (numerical) data.


A graph which uses lines and points to present change over time is known as a line graph. Line graphs can show, for example, the number of animals left on Earth, the growing population of the world, or the day-by-day rise and fall in the number of bitcoins. Line graphs tell us about changes occurring over time, and a single line graph can show two or more such changes at once.


A pie chart is a type of graph that represents numerical proportions as sectors of a circle. It can be replaced in most cases by other plots such as a bar chart, box plot, or dot plot. Research shows that it is difficult to compare the different sections of a given pie chart, or to compare data across different pie charts.

Frequency Distribution Table

A frequency distribution table is a chart that summarises the values in the data and how often each occurs. It has two columns: the first column lists the various outcomes in the data, while the second column lists the frequency of each outcome. Putting data into such a table makes it easier to understand and analyze.

For example: To create a frequency distribution table, we first list all the outcomes in the data. In this example, the outcomes are 0 runs, 1 run, 2 runs, and 3 runs; we list them in numerical order in the first column. Next, we count how many times each outcome occurred. The team scored 0 runs in the 1st, 4th, 7th, and 8th innings; 1 run in the 2nd, 5th, and 9th innings; 2 runs in the 6th inning; and 3 runs in the 3rd inning. We record the frequency of each outcome in the second column. You can see that the table is a far more useful way to show this data.

Baseball Team Runs Per Inning
Number of Runs   Frequency
0                4
1                3
2                1
3                1
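A short Python sketch of the same tally, using the standard collections.Counter class; the runs_per_inning list simply restates the innings described above.

    # Build the frequency distribution of runs per inning.
    from collections import Counter

    runs_per_inning = [0, 1, 3, 0, 1, 2, 0, 0, 1]   # innings 1-9, as in the example
    frequency = Counter(runs_per_inning)

    print("Runs  Frequency")
    for runs in sorted(frequency):
        print(f"{runs:<5} {frequency[runs]}")
    # 0     4
    # 1     3
    # 2     1
    # 3     1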

Sample Questions

Question 1: The school fee submission status of 10 students of class 10 is given below:

Muskan  Paid
Kritika Not paid
Anmol Not paid
Raghav Paid
Nitin Paid
Dhanvi Paid
Jasleen Paid
Manas Not paid
Anshul Not paid
Sahil Paid
In order to draw the bar graph for the data above, we prepare the frequency table as given below.

Fee submission   No. of Students
Paid             6
Not paid         4

Now we represent the data using a bar graph. It can be drawn by following these steps:

Step 1: Draw the two axes of the graph, the X-axis and the Y-axis. The categories of the data are placed on the X-axis (the horizontal line) and the frequencies of the data on the Y-axis (the vertical line).
Step 2: Give a numeric scale to the Y-axis, starting from zero and ending at or above the highest value in the data.
Step 3: Choose a suitable interval for the numeric scale, for example 0, 1, 2, 3, ... or 0, 10, 20, 30, ... or 0, 20, 40, 60, ...
Step 4: Label the X-axis appropriately.
Step 5: Draw the bars according to the data, keeping all the bars the same width and leaving the same distance between bars.

Question 2: Observe the following pie chart, which shows the money spent by Megha at the funfair. Each colour indicates the amount spent on one category. The total of the data is 15, and the amount spent on each category is as follows:

Chocolates – 3

Wafers – 3

Toys – 2

Rides – 7

To convert this into pie chart percentages, we apply the formula: (Frequency / Total Frequency) × 100. Converting the above data into percentages:

Amount spent on rides: (7/15) × 100 ≈ 47%
Amount spent on toys: (2/15) × 100 ≈ 13%
Amount spent on wafers: (3/15) × 100 = 20%
Amount spent on chocolates: (3/15) × 100 = 20%
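The same percentages can be computed with a few lines of Python; this is only a sketch of the formula above, with results rounded to whole percentages to match the figures given.

    # Apply (frequency / total frequency) * 100 to the funfair data.
    amounts = {"Chocolates": 3, "Wafers": 3, "Toys": 2, "Rides": 7}
    total = sum(amounts.values())   # 15

    for item, value in amounts.items():
        print(f"{item}: {value / total * 100:.0f}%")
    # Chocolates: 20%, Wafers: 20%, Toys: 13%, Rides: 47%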

Question 3: The line graph below shows how Devdas's height changes as he grows. Observe the graph and answer the questions that follow.

(Line graph figure: Devdas's height in inches plotted against his age in years.)

(i) What was the height of Devdas at 8 years? Answer: 65 inches.
(ii) What was the height of Devdas at 6 years? Answer: 50 inches.
(iii) What was the height of Devdas at 2 years? Answer: 35 inches.
(iv) How much has Devdas grown from 2 to 8 years? Answer: 30 inches.
(v) When was Devdas 35 inches tall? Answer: At 2 years.


Data Representation in a Computer

1. Introduction

  • Computers are classified according to functionality, physical size and purpose .
  • Functionality, Computers could be analog, digital or hybrid . Digital computers process data that is in discrete form whereas analog computers process data that is continuous in nature. Hybrid computers on the other hand can process data that is both discrete and continuous.
  • In digital computers, the user input is first converted and transmitted as electrical pulses that can be represented by two unique states ON and OFF. The ON state may be represented by a “1” and the off state by a “0”.The sequence of ON’S and OFF’S forms the electrical signals that the computer can understand.
  • A digital signal rises suddenly to a peak voltage of +1, stays there for some time, and then suddenly drops to the -1 level; an analog signal, on the other hand, rises to +1 and falls to -1 in a smooth, continuous fashion.
  • Although the two waveforms look different in appearance, notice that they repeat themselves at equal time intervals. Electrical signals or waveforms of this nature are said to be periodic. Generally, a periodic wave representing a signal can be described using the following parameters:
  • Amplitude(A)
  • Frequency(f)
  • periodic time(T)
  • Amplitude (A): this is the maximum displacement that the waveform of an electric signal can attain.
  • Frequency (f): the number of cycles made by a signal in one second. It is measured in hertz (Hz); 1 hertz is equivalent to 1 cycle per second.
  • Periodic time (T): the time taken by a signal to complete one cycle is called periodic time. Periodic time is given by the formula T=1/f, where f is the frequency of the wave.
  • When a digital signal is to be sent over analog telephone lines e.g. e-mail, it has to be converted to analog signal. This is done by connecting a device called a modem to the digital computer. This process of converting a digital signal to an analog signal is known as modulation . On the receiving end, the incoming analog signal is converted back to digital form in a process known as demodulation .

2. Concepts of Data Representation in Digital Computers

  • Data and instructions cannot be entered and processed directly into computers using human language. Any type of data be it numbers, letters, special symbols, sound or pictures must first be converted into machine-readable form i.e. binary form. Due to this reason, it is important to understand how a computer together with its peripheral devices handles data in its electronic circuits, on magnetic media and in optical devices.

Data representation in digital circuits

  • Electronic components, such as the microprocessor, are made up of millions of electronic circuits. The availability of a high voltage (on) in these circuits is interpreted as '1', while a low voltage (off) is interpreted as '0'. This concept can be compared to switching a simple electric circuit on and off: when the switch is closed, the high voltage in the circuit causes the bulb to light (the '1' state); when the switch is open, the bulb goes off (the '0' state). This forms the basis for describing data representation in digital computers using the binary number system.

Data representation on magnetic media

  • The presence of a magnetic field in one direction on magnetic media is interpreted as '1', while a field in the opposite direction is interpreted as '0'. Magnetic technology is mostly used on storage devices that are coated with special magnetic materials such as iron oxide. Data is written on the media by arranging the magnetic dipoles of some iron oxide particles to face in one direction and others in the opposite direction.

Data representation on optical media

In optical devices, the presence of light is interpreted as '1' while its absence is interpreted as '0'. Optical devices use this technology to read or store data. Take the example of a CD-ROM: if the shiny surface is placed under a powerful microscope, it is observed to have very tiny holes called pits. The areas that do not have pits are called land. A laser beam reflected from the land is interpreted as 1; a laser beam entering a pit is not reflected, and this is interpreted as 0. The reflected pattern of light from the rotating disc falls on a receiving photoelectric detector that transforms the pattern into digital form.

Reason for use of binary system in computers

  • It has proved difficult to develop devices that can understand natural language directly due to the complexity of natural languages. However, it is easier to construct electric circuits based on the binary or ON and OFF logic. All forms of data can be represented in binary system format. Other reasons for the use of binary are that digital devices are more reliable, small and use less energy as compared to analog devices.

Bits, bytes, nibble and word

  • The terms bits, bytes, nibble and word are used widely in reference to computer memory and data size.
  • Bit: a binary digit, which can be either 0 or 1. It is the basic unit of data or information in digital computers.
  • Byte: a group of 8 bits used to represent a character. A byte is considered the basic unit of measuring memory size in a computer.
  • Nibble: half a byte, which is a grouping of 4 bits.
  • Word: two or more bytes make a word. The term word length is used as the measure of the number of bits in each word. For example, a word can have a length of 16 bits, 32 bits, 64 bits etc.

Types of data representation

  • Computers not only process numbers, letters and special symbols but also complex types of data such as sound and pictures. However, these complex types of data take a lot of memory and processor time when coded in binary form.
  • This limitation necessitates the need to develop better ways of handling long streams of binary digits.

Number systems and their representation

  • A number system is a set of symbols used to represent values derived from a common base or radix.
  • As far as computers are concerned, number systems can be classified into four major categories:
  • decimal number system
  • binary number system
  • octal number system
  • hexadecimal number system

Decimal number system

  • The term decimal is derived from a Latin prefix deci, which means ten. Decimal number system has ten digits ranging from 0-9. Because this system has ten digits; it is also called a base ten number system or denary number system.
  • A decimal number should always be written with a subscript 10, e.g. X₁₀
  • But since this is the most widely used number system in the world, the subscript is usually understood and ignored in written work. However ,when many number systems are considered together, the subscript must always be put so as to differentiate the number systems.
  • The magnitude of a number can be considered using these parameters.
  • Absolute value
  • Place value or positional value
  • The absolute value is the magnitude of a digit in a number. for example the digit 5 in 7458 has an absolute value of 5 according to its value in the number line.
  • The place value of a digit in a number refers to the position of the digit in that number i.e. whether; tens, hundreds, thousands etc.
  • The total value of a number is the sum of the place value of each digit making the number.
  • The base value of a number, also known as the radix, depends on the type of number system being used. The value of any number depends on the radix; for example, the number 100₁₀ is not equivalent to 100₂.

Binary number system

Consists of only two digits, 0 and 1, and has a base of two. The place value of binary numbers goes up in factors of two from right to left.

Octal number system

Consists of eight digits ranging from 0 to 7. The place value of octal numbers goes up in factors of eight from right to left.

Hexadecimal number system

This is a base 16 number system that consists of sixteen digits: 0-9 and the letters A-F, where A is equivalent to 10, B to 11, and so on up to F, which is equivalent to 15 in the base ten system. The place value of hexadecimal numbers goes up in factors of sixteen.

  • A hexadecimal number can be denoted using 16 as a subscript or a capital letter H to the right of the number. For example, 94B can be written as 94B₁₆ or 94BH.

Further conversion of numbers from one number system to another

  • To convert numbers from one system to another, the following conversions will be considered:
  • Converting between binary and decimal numbers.
  • Converting octal numbers to decimal and binary form.
  • Converting hexadecimal numbers to decimal and binary form.
  • a) Conversion between binary and decimal number
  • Converting binary numbers to decimal numbers
  • To convert a binary number to a decimal number, we proceed as follows:

First, write the place values starting from the right hand side.

  • Write each digit under its place value.
  • Multiply each digit by its corresponding place value.
  • Add up the products. The answer will be the decimal number in base ten.

Example: Convert 101101₂ to a base 10 (decimal) number.

Place value:    2⁵   2⁴   2³   2²   2¹   2⁰
Binary digit:    1    0    1    1    0    1

Multiply each digit by its place value:

N = (1 × 2⁵) + (0 × 2⁴) + (1 × 2³) + (1 × 2²) + (0 × 2¹) + (1 × 2⁰)
N = 32 + 0 + 8 + 4 + 0 + 1
N = 45

Therefore 101101₂ = 45₁₀

NB: remember to indicate the base subscript, since it is the value that distinguishes the different number systems.
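The same place-value method can be expressed as a short Python sketch that works for any base; the function name digits_to_decimal is an illustrative choice, and the printed checks reproduce the worked examples in this section.

    # Multiply each digit by its place value (base ** position) and add up.
    def digits_to_decimal(digits: str, base: int) -> int:
        value = 0
        for position, ch in enumerate(reversed(digits)):
            value += int(ch, base) * (base ** position)   # int(ch, base) also handles A-F
        return value

    print(digits_to_decimal("101101", 2))   # 45
    print(digits_to_decimal("512", 8))      # 330
    print(digits_to_decimal("111", 16))     # 273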

  • The binary equivalent of the fractional part is obtained by reading the integral digits of the successive products from the top downwards.
  • Combine the integral and fractional parts to get the full binary equivalent.

Example: Convert 0.375₁₀ into binary form.

Multiply the fraction by 2 repeatedly and read the integral digits from the top downwards:

0.375 × 2 = 0.750
0.750 × 2 = 1.500
0.500 × 2 = 1.000 (fraction becomes zero)

Reading the integral digits downwards gives 011.

Therefore 0.375₁₀ = 0.011₂

NB: When converting a real number from binary to decimal, work out the integral part and the fractional parts separately then combine them.

Example: Convert 11.011₂ to a decimal number.

Convert the integral and the fractional parts separately, then add them up.

Weight:           2¹     2⁰     2⁻¹    2⁻²     2⁻³
Binary digit:      1      1      0      1       1
Digit × weight:    2      1      0      0.25    0.125

Integral part:
2 × 1 = 2.000
1 × 1 = 1.000
Sum   = 3.000

Fractional part:
0.50 × 0  = 0.000
0.25 × 1  = 0.250
0.125 × 1 = 0.125
Sum       = 0.375

3.000₁₀ + 0.375₁₀ = 3.375₁₀

Thus 11.011₂ = 3.375₁₀

  • iv) Converting a decimal fraction to binary

Divide the integral part continuously by 2. For the fractional part, proceed as follows (a code sketch of this procedure appears after the list):

  • Multiply the fractional part by 2 and note down the product.
  • Take the fractional part of the immediate product and multiply it by 2 again.
  • Continue this process until the fractional part of the subsequent product is 0 or starts to repeat itself.
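A minimal Python sketch of this repeated-multiplication procedure; the max_bits limit is an illustrative guard for fractions whose binary form never terminates.

    # Convert the fractional part of a decimal number to binary digits.
    def fraction_to_binary(fraction: float, max_bits: int = 8) -> str:
        bits = ""
        while fraction != 0 and len(bits) < max_bits:
            fraction *= 2
            integral_digit = int(fraction)   # 0 or 1, the digit to record
            bits += str(integral_digit)
            fraction -= integral_digit
        return "0." + bits

    print(fraction_to_binary(0.375))   # 0.011, as in the worked example above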

Example: Convert the octal number 321₈ to its binary equivalent.

Working from left to right, each octal digit is represented using three binary digits, and the groups are then combined to get the final binary equivalent:

Octal digit:         3     2     1
Binary equivalent:   011   010   001

Combining the three groups from left to right: 321₈ = 011010001₂

Converting binary numbers to hexadecimal numbers

To convert binary numbers to their hexadecimal equivalents, simply group the digits of the binary number into groups of four from right to left, e.g. 1101 0001. The next step is to write the hexadecimal equivalent of each group.

The equivalent of 11010001 is D1H or D1₁₆

Converting hexadecimal numbers to decimal and binary numbers .

Converting hexadecimal numbers to decimal number

To convert hexadecimal number to base 10 equivalent we proceed as follows:

  • If a digit is a letter such as ‘A’ write its decimal equivalent
  • Multiply each hexadecimal digit with its corresponding place value and then add the products


Converting octal numbers to decimal and binary numbers

Converting octal numbers to decimal numbers

  • To convert a base 8 number to its decimal equivalent, we use the same method as we did with binary numbers. However, it is important to note that the maximum value of an octal digit is 7. For example, 982 is not a valid octal number because the digit 9 is not an octal digit, but 736₈ is valid because all the digits are in the range 0-7. The example below shows how to convert an octal number to a decimal number.

Example 1.13

Convert 512₈ to its base 10 equivalent.

Write each digit under its place value:

Place value:   8²    8¹    8⁰
               64    8     1
Octal digit:   5     1     2

Multiply each digit by its place value:

N = (5 × 8²) + (1 × 8¹) + (2 × 8⁰)
  = (5 × 64) + 8 + 2
  = 320 + 8 + 2
N = 330

Therefore 512₈ = 330₁₀

Converting octal numbers to binary numbers

  • To convert an octal number to binary, each octal digit is represented by three binary digits, because the maximum octal digit, 7, can be represented with three bits (111). See the table:

Octal digit   Binary equivalent
0             000
1             001
2             010
3             011
4             100
5             101
6             110
7             111

Example: Convert the hexadecimal number 111₁₆ to its decimal equivalent.

Place each digit under its place value:

Place value:          16²   16¹   16⁰
Hexadecimal digit:    1     1     1

256 × 1 = 256
16 × 1  = 16
1 × 1   = 1

256 + 16 + 1 = 273

Therefore 111₁₆ = 273₁₀

  • The following examples illustrate how to convert hexadecimal number to a decimal number

Convert the hexadecimal number 111 16 to its binary equivalent
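A small Python sketch of the same place-value method (illustrative; the name hex_to_decimal is mine, and int(text, 16) is the built-in shortcut):

```python
# Sketch: convert a hexadecimal string to decimal, treating the letters
# A-F as the values 10-15 and weighting each digit by a power of 16.
def hex_to_decimal(hex_string: str) -> int:
    digits = "0123456789ABCDEF"
    total = 0
    for position, digit in enumerate(reversed(hex_string.upper())):
        total += digits.index(digit) * 16 ** position
    return total

print(hex_to_decimal("111"))  # 273  (same as int("111", 16))
```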

Converting hexadecimal numbers to binary numbers

  • Since the largest hexadecimal digit, F, is equivalent to the binary number 1111₂, each hexadecimal digit is represented using 4 binary digits, as shown in the table below.

Hexadecimal digit   Decimal equivalent   Binary equivalent
0                   0                    0000
1                   1                    0001
2                   2                    0010
3                   3                    0011
4                   4                    0100
5                   5                    0101
6                   6                    0110
7                   7                    0111
8                   8                    1000
9                   9                    1001
A                   10                   1010
B                   11                   1011
C                   12                   1100
D                   13                   1101
E                   14                   1110
F                   15                   1111

The simplest method of converting a hexadecimal number to binary is to express each hexadecimal digit as a four-bit binary number and then arrange the groups according to their corresponding positions, as shown in the examples below.

Convert 321₁₆ into binary.

Hexadecimal digit   3      2      1
Binary equivalent   0011   0010   0001

Combining the three sets of bits, we get 001100100001₂

321₁₆ = 001100100001₂

Convert 5E6₁₆ into binary.

Hexadecimal digit   5      E      6
Binary equivalent   0101   1110   0110

5E6₁₆ = 010111100110₂
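A minimal Python sketch of the digit-by-digit expansion used in both examples (the helper name hex_to_binary is illustrative):

```python
# Sketch: convert a hexadecimal string to binary by expanding each
# hexadecimal digit to its 4-bit binary equivalent.
def hex_to_binary(hex_string: str) -> str:
    return "".join(format(int(digit, 16), "04b") for digit in hex_string)

print(hex_to_binary("321"))  # 001100100001
print(hex_to_binary("5E6"))  # 010111100110
```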

Symbolic representation using coding schemes

  • In computing, a single character such as a letter, a number or a symbol is represented by a group of bits. The number of bits per character depends on the coding scheme used.
  • The most common coding schemes are:
  • Binary Coded Decimal (BCD),
  • Extended Binary Coded Decimal Interchange Code (EBCDIC) and
  • American Standard Code for Information Interchange (ASCII).

Binary Coded Decimal

  • Binary Coded Decimal is a 4-bit code used to represent numeric data only. For example, a number like 9 can be represented using Binary Coded Decimal as 1001₂.
  • Binary Coded Decimal is mostly used in simple electronic devices like calculators and microwaves. This is because it makes it easier to process and display individual numbers on their Liquid Crystal Display (LCD) screens.
  • Standard Binary Coded Decimal, an enhanced format of Binary Coded Decimal, is a 6-bit representation scheme which can also represent non-numeric characters. This allows 64 characters to be represented. For example, the letter A can be represented as 110001₂ using standard Binary Coded Decimal.

Extended Binary Coded Decimal Interchange Code (EBCDIC)

  • Extended Binary Coded Decimal Interchange Code (EBCDIC) is an 8-bit character-coding scheme used primarily on IBM computers. A total of 256 (2⁸) characters can be coded using this scheme. For example, the symbolic representation of the letter A using Extended Binary Coded Decimal Interchange Code is 11000001₂.

American Standard Code for Information Interchange (ASCII)

  • American Standard Code for Information Interchange (ASCII) is a 7-bit code, which means that only 128 (2⁷) characters can be represented. However, manufacturers have added an eighth bit to this coding scheme, which now provides for 256 characters.
  • This 8-bit coding scheme is referred to as 8-bit ASCII (extended ASCII). The symbolic representation of the letter A using this scheme is 1000001₂ (see the short sketch below).
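For illustration, Python's built-in ord() exposes these ASCII codes directly (this snippet is mine, not part of the original text):

```python
# Sketch: ord() returns the ASCII/Unicode code point of a character;
# format() shows it as a 7-bit binary pattern.
for character in "A", "a", "0":
    code = ord(character)
    print(character, code, format(code, "07b"))
# A 65 1000001
# a 97 1100001
# 0 48 0110000
```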

Binary arithmetic operations

  • In mathematics, the four basic arithmetic operations applied on numbers are addition, subtraction, multiplication and division.
  • In computers, the same operations are performed inside the central processing unit by the arithmetic and logic unit (ALU). However, the arithmetic and logic unit cannot perform binary subtraction directly. It performs binary subtraction using a process known as complementation, i.e. adding the complement of the number being subtracted, as described later in this section. For multiplication and division, the arithmetic and logic unit uses a method called shifting before adding the bits.

Representation of signed binary numbers

  • In computer technology, there are three common ways of representing a signed binary number:
  • Prefixing an extra sign bit to a binary number.
  • Using ones complement.
  • Using twos complement.

Prefixing an extra sign bit to a binary number

  • In decimal numbers, a signed number has a prefix "+" for a positive number, e.g. +27₁₀, and "-" for a negative number, e.g. -27₁₀.
  • However, in binary, a negative number may be represented by prefixing a digit 1 to the number, while a positive number may be represented by prefixing a digit 0. For example, the 7-bit binary equivalent of 127 is 1111111₂. To indicate that it is positive, we add an extra bit (0) to the left of the number, i.e. (0)1111111₂.
  • To indicate that it is a negative number, we add an extra bit (1), i.e. (1)1111111₂.
  • The problem with this method is that zero can be represented in two ways, i.e. (0)0000000₂ and (1)0000000₂.

Ones complement

  • The term complement refers to a part which, together with another, makes up a whole. For example, in geometry, two complementary angles add up to 90°.
  • The idea of a complement is used to address the problem of signed numbers, i.e. positive and negative.
  • In decimal numbers (0 to 9), we talk of the nines complement. For example, the nines complement of 9 is 0, that of 5 is 4, while that of 3 is 6.
  • In binary numbers, the ones complement is the bitwise NOT applied to the number. Bitwise NOT is a unary operator (an operation on only one operand) that performs logical negation on each bit. For example, the bitwise NOT of 1100₂ is 0011₂, i.e. 0s are negated to 1s while 1s are negated to 0s.

Twos complement

  • Twos complement, equivalent to the tens complement in decimal numbers, is the most popular way of representing negative numbers in computer systems. The advantages of using this method are:
  • There are no two ways of representing a zero, as is the case with the other two methods.
  • Effective addition and subtraction can be done even with numbers that are represented with a sign bit, without a need for extra circuitry to examine the sign of an operand.
  • The twos complement of a number is obtained by getting the ones complement and then adding a 1. For example, to get the twos complement of the decimal number 45₁₀, first convert it to its binary equivalent, then find its ones complement and add a 1 to it, i.e.

45₁₀ = 00101101₂

Bitwise NOT (00101101) = 11010010

Twos complement = 11010010₂ + 1₂

= 11010011₂
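The same invert-then-add-one procedure, sketched in Python for an assumed 8-bit width (the helper name twos_complement is illustrative):

```python
# Sketch: the 8-bit twos complement of a number, obtained by inverting
# the bits (ones complement) and adding 1, as in the working above.
def twos_complement(value: int, bits: int = 8) -> str:
    ones = (~value) & (2 ** bits - 1)    # bitwise NOT, kept to 8 bits
    twos = (ones + 1) & (2 ** bits - 1)
    return format(twos, f"0{bits}b")

print(twos_complement(45))  # 11010011, i.e. -45 in 8-bit twos complement
```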

Binary addition

The five possible additions in binary are:

  • 0 + 0 = 0
  • 0 + 1₂ = 1₂
  • 1₂ + 0 = 1₂
  • 1₂ + 1₂ = 10₂ (write down 0, carry 1)
  • 1₂ + 1₂ + 1₂ = 11₂ (write down 1, carry 1)

Find the sum of 111₂ + 011₂.

Arrange the bits vertically and work from right to left:

    111
  + 011
   1010₂

Step 1: 1₂ + 1₂ = 10₂ (write down 0 and carry 1).

Step 2: 1₂ + 1₂ + 1₂ (the carry) = 11₂ (write down 1 and carry 1).

Step 3: 1₂ + 0₂ + 1₂ (the carry) = 10₂. Since this is the last step, write down 10.

Therefore 111₂ + 011₂ = 1010₂

This can be summarized in the table below:

1st number    1    1    1
2nd number    0    1    1
Carry digit   -    1    1
Sum           10   1    0

Add the following binary numbers: 10110₂ + 1011₂ + 111₂

Add the first two numbers and then add the sum to the third number as follows:

Step 1:               Step 2:

   10110₂                100001₂
 +  1011₂              +    111₂
  100001₂                101000₂
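As a quick check, the additions above can be reproduced in Python (an illustrative sketch; add_binary is my own helper built on the int and format built-ins):

```python
# Sketch: add binary strings by converting to integers, adding, and
# converting back, which mirrors the column-by-column working above.
def add_binary(*numbers: str) -> str:
    return format(sum(int(number, 2) for number in numbers), "b")

print(add_binary("111", "011"))            # 1010
print(add_binary("10110", "1011", "111"))  # 101000
```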

Binary subtraction

Direct subtraction

The four possible subtractions in binary are:

  • 0 – 0 = 0
  • 1₂ – 0 = 1₂
  • 1₂ – 1₂ = 0
  • 10₂ – 1₂ = 1₂ (borrow 1 from the next most significant digit to make 0 become 10₂, hence 10₂ – 1₂ = 1₂)

Subtraction using ones complement

The main purpose of using the ones complement in computers is to perform binary subtraction. For example, to get the difference 5 – 3 using the ones complement, we proceed as follows:

  • Rewrite the problem as 5 + (-3) to show that the computer performs binary subtraction by adding the binary equivalent of 5 to the ones complement of 3.
  • Convert the absolute value of 3 into its 8-bit equivalent, i.e. 00000011₂.
  • Take the ones complement of 00000011₂, i.e. 11111100₂, which is the binary representation of -3₁₀.
  • Add the binary equivalent of 5 to the ones complement of 3, i.e. 00000101₂ + 11111100₂ = (1)00000001₂.
  • Add the overflow bit (the end-around carry) back to the result: 00000001₂ + 1₂ = 00000010₂, which is the binary equivalent of +2 (see the sketch below).
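A Python sketch of this procedure for an assumed 8-bit width, including the end-around carry (the helper name subtract_ones_complement is illustrative):

```python
# Sketch: 8-bit subtraction using the ones complement and the
# end-around carry, reproducing 5 - 3 from the steps above.
def subtract_ones_complement(a: int, b: int, bits: int = 8) -> int:
    mask = 2 ** bits - 1
    ones_complement_b = (~b) & mask      # e.g. 3 -> 11111100
    total = a + ones_complement_b        # may overflow into a ninth bit
    if total > mask:                     # end-around carry: add it back
        total = (total & mask) + 1
    return total

print(format(subtract_ones_complement(5, 3), "08b"))  # 00000010, i.e. 2
```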

Subtraction using twos complement

As with the ones complement, the twos complement of a number is obtained by negating a positive number to its negative counterpart. For example, to get the difference 5 – 3 using the twos complement, we proceed as follows:

  • Rewrite the problem as 5 + (-3).
  • Convert the absolute value of 3 into its 8-bit binary equivalent, i.e. 00000011.
  • Take the ones complement of 00000011, i.e. 11111100.
  • Add a 1 to the ones complement 11111100 to get 11111101, the twos complement of 3.
  • Add the binary equivalent of 5 to the twos complement of 3, i.e.

00000101 + 11111101 = (1)00000010

Ignoring the overflow bit, the resulting number is 00000010, which is directly read as the binary equivalent of +2.

Using twos complement, work out 31₁₀ – 17₁₀ in binary form.

17₁₀ in binary:  00010001

1's complement:  11101110

2's complement:  11101111

31₁₀ = 00011111₂

00011111 + 11101111 = (1)00001110₂

Ignoring the overflow bit, the answer is 00001110₂ = 14₁₀.
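A Python sketch of twos complement subtraction for an assumed 8-bit width, reproducing this working (the helper name subtract_twos_complement is illustrative):

```python
# Sketch: 8-bit subtraction using the twos complement, reproducing
# 31 - 17 from the working above; the overflow bit is discarded.
def subtract_twos_complement(a: int, b: int, bits: int = 8) -> int:
    mask = 2 ** bits - 1
    twos_complement_b = ((~b) + 1) & mask   # e.g. 17 -> 11101111
    return (a + twos_complement_b) & mask   # ignore the overflow bit

print(format(subtract_twos_complement(31, 17), "08b"))  # 00001110, i.e. 14
```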
