Data Types in C Programming

When we write a program, we need to store certain values for later use. These values are kept in memory locations. Although every memory location has its own address, it is easier to identify a location by a name than by its address. Hence we use variables: named memory locations that store these values. Variables can hold input from the user, take part in calculations, or hold results and messages to be displayed. But we cannot store every kind of data in every variable. Defining the type of data that each variable can store makes programming in C more systematic: variables are used consistently, and confusion and mishandling of data are avoided.

Similarly, the C language revolves around functions. A function performs a certain task, and it often produces a result that must be returned to the calling function, which receives it through the function call itself. A function cannot return just any kind of value. As with variables, declaring in advance the type of data that a function returns makes the program more logical.

All of this is done using datatypes in C. A datatype specifies the kind of data a variable or function result holds, the range of values it can store, and how many bytes of memory it occupies. A variable must be declared with its datatype before it is used in the program or function. The basic datatypes are named by reserved keywords in C such as int, float, double and char.


A variable is declared using its datatype as below :

datatype variable_name;
int intNum1; // variable with integer datatype, defines the variable
float flNum=3.14; // Variable with real number, defines and initializes the variable
char chOption; // chOption is of character type

When we declare a variable like the ones above inside a function, the declaration also defines the variable, that is, it reserves memory for it. If we give the variable an initial value in the declaration, it is both defined and initialized. Declaration, definition and initialization can also happen at different places. The keyword ‘extern’ declares a variable without defining it: it tells the compiler that the variable is defined somewhere else in the program at file scope (outside any function), possibly in another source file. Note that a variable declared inside a function with the same name would be a separate local variable, not the definition of the extern one.

#include <stdio.h>

extern float marks1, marks2; // declare the float variables; no memory is reserved here

int main(void) {
	marks1 = 67.5f; // assign values to the variables
	marks2 = 88;

	printf("Marks in Subject 1 is: %f\n", marks1); // display the variable value
	printf("Marks in Subject 2 is: %f\n", marks2);
	return 0;
}

float marks1, marks2; // define the same variables at file scope; memory is reserved here

There are different types of datatypes.

Primitive / Basic/ Fundamental Datatype

These are the most basic datatypes used to define variables and functions. They are used to declare numbers and characters.

Character Datatypes

This datatype is used to declare character variables. Each variable of character type can hold only one character at a time, because the datatype occupies only one byte of memory. On typical systems a signed char can store values from -128 to 127, and an unsigned char can store values from 0 to 255. Whether a plain char is signed or unsigned depends on the compiler.

char chOption; // chOption is of character type
unsigned char uchOption; // uchOption is of character type, but unsigned

Integer Datatypes

This datatype declares the variable as an integer. It tells the compiler that the variable can contain only whole numbers; it cannot hold fractional values. The value can be positive, negative or zero. An int occupies 2 bytes (on older systems) or 4 bytes of memory. If the size of int is 4 bytes, it can store values from -2³¹ to 2³¹-1, i.e. -2³¹, -2³¹+1, -2³¹+2, ….. -3, -2, -1, 0, 1, 2, 3, …. 2³¹-3, 2³¹-2, 2³¹-1.

It is declared as: int intNum1; // variable with integer datatype

Integer datatypes can be signed or unsigned. A signed integer is normally written simply as int. For an unsigned integer, the ‘unsigned’ keyword is placed before int. An unsigned integer also has a size of 2 bytes or 4 bytes depending on the system, but it holds values from 0 to 2³²-1 when int is 4 bytes.

int intNum1; // this is a signed integer variable- can be positive or negative
unsigned int intNum2; // this is an unsigned integer variable: it can contain only zero or positive values

Integer datatypes come in three sizes: short int, int and long int. (These are size variants of the integer type, not storage classes; in C the term storage class refers to auto, register, static and extern.) Each of them can be signed or unsigned. A short int is used for a smaller range of numbers and typically occupies 2 bytes. An int typically occupies 4 bytes and hence can hold a bigger range of values. A long int is at least as large as an int, occupying 4 or 8 bytes depending on the system, and is used to store an even bigger range of values.

Floating-Point Datatypes

These datatypes are used to store real numbers, including numbers written in exponential notation. A float typically occupies 4 bytes of memory and can store magnitudes from about 1.2e-38 to 3.4e+38, positive or negative. For a larger range or more precision we can use double, which typically occupies 8 bytes, or long double, whose size depends on the system (commonly 10, 12 or 16 bytes). Float and double variables behave alike except for their size and precision: a float has about 6 significant decimal digits of precision, whereas a double has about 15.

float flAvg;
double dbl_fraction_number;
long double lgdbl_fractNum;

Void Datatype

This datatype does not contain any value. It is mainly used to declare functions that do not return a value, to indicate that a function does not accept any arguments, or, in the form void *, to declare a generic pointer that can hold the address of any type of data. A variable cannot be declared with the plain void type.

When a function that takes no arguments or returns no value needs to be declared, we use the void datatype. It tells the compiler that there is no value in that position.

    void fnDisplayName(void); // no arguments, no return value
    void fnGetAddress(void);
    int fn_FindSum(void); // no arguments, returns an int

When we use pointers, we may not know at the time of declaration what datatype the pointer will point to. In such cases we declare the pointer as void * and assign it an address, or allocate memory for it at run time. Later in the code we cast the pointer to the required datatype before using the data it points to. (For more details, refer to the topic on pointers.)

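void *ptr;
ptr = &intVar1; // intVar1 is assumed to be an int variable declared earlier

void *ptr;
ptr = malloc (sizeof(int) * 10); // needs <stdlib.h>; reserves space for 10 integers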

Non-primitive/ Derived/ Structured Datatype

Derived datatypes are datatypes that are built from the primitive datatypes. They declare a variable that contains a set of values, of the same or of different datatypes, bound under one name. Hence they are called derived datatypes. The main derived datatypes are described below.

Arrays

An array is a named variable that contains a set of values of the same datatype. That means we can store multiple values under a single variable name. This is made possible by using indexes on the variable name. The elements can be of any primitive type.

For example,

int intNumbers [10]; // it stores 10 different integer values in intNumbers variable
unsigned int intVar [10]; // it stores 10 different unsigned integer values
float flReal [5]; // it stores 5 different real values in flReal variable
char chNames [20]; //it holds 20 different characters

Each value in these arrays is accessed using its index. For example, the 5th element of the array intNumbers is accessed as intNumbers[4]. The index starts from zero; hence the 5th element is at index 4.

The size of an array is equal to the size of its datatype multiplied by the number of elements in it. In the above example, assuming int and float occupy 4 bytes each,

Size of intNumbers = sizeof(int) * 10 = 4 * 10 = 40 bytes.
Size of intVar = sizeof(unsigned int) * 10 = 4 * 10 = 40 bytes.
Size of flReal = sizeof(float) * 5 = 4 * 5 = 20 bytes.
Size of chNames = sizeof(char) * 20 = 1 * 20 = 20 bytes.

Structures

Structures are used to hold a set of similar or dissimilar variables. They are useful when we want to store related information under one name.
For example, the details of a particular student can be stored in a structure called Student as below :

struct Student{
	int intStdId;
	char chrName[15];
	char chrAddress[25];
	int Age;
	float flAvgMarks;
	char chrGrade;
};

Here we can note that the structure Student has different types of variables. All of them are related to a student and are combined under one common name, Student. Unlike arrays, each element of a structure is addressed by its own name. A structure can contain primitive variables as well as derived ones: arrays, structures, unions and even pointers.

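Here the size of the structure is at least the sum of the sizes of its individual elements. In the Student structure above, assuming int and float occupy 4 bytes each,

Sum of member sizes = sizeof (intStdId) + sizeof (chrName) + sizeof (chrAddress)
+ sizeof (Age) + sizeof (flAvgMarks) + sizeof (chrGrade)
= sizeof (int) + (15 * sizeof (char)) + (25 * sizeof (char))
+ sizeof (int) + sizeof (float) + sizeof (char)
= 4 bytes + (15 * 1 byte) + (25 * 1 byte) + 4 bytes + 4 bytes + 1 byte
= 53 bytes.

The compiler may add padding bytes between or after the members to keep them aligned, so sizeof (struct Student) can be larger than this sum (56 bytes is common on such systems).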

Union

A union is another datatype in C, which is similar to a structure. It is declared and accessed in the same way as a structure, but the keyword union is used to declare it.

union Student{
	int intStdId;
	char chrName[15];
	char chrAddress[25];
	int Age;
	float flAvgMarks;
	char chrGrade;
};

The main difference between a structure and a union is in memory allocation. In a structure, memory is allocated for every element, so the total is at least the sum of the element sizes. In a union, all elements share the same memory, so the total is the size of the largest element (plus any padding needed for alignment), and only one element can hold a value at a time. In the Student union above, the size is determined by chrAddress, as it is the largest element.

Pointers

Pointers are special variables used to store the address of another variable. Through a pointer, the program can reach the memory allocated to that other variable and read or change its value. This is an advantage when accessing arrays, passing values to and returning multiple values from functions, handling strings, and building data structures like stacks, linked lists, binary trees, B+ trees etc. A pointer is declared in the same way as any other primitive variable, but a ‘*’ is added before the variable name to indicate that it is a pointer. The compiler then knows that it is a pointer and must be treated differently from other variables.

int *intPtr;
float *flflPtr;
int *intArrPtr [10];
char *chrName;
char *chrMonthPtr [12];

Data structures

Data structures like stacks, queues, linked lists etc. are special kinds of variables that use one or more primitive datatypes. They are usually created using structure datatypes, but they expand and shrink as data is added and removed. Hence they are also considered another kind of derived datatype.

User-Defined Datatype

Sometimes declaring variables with the existing primitive or derived datatypes does not give a meaningful name, does not convey the purpose of the variable, or is confusing. Sometimes the user / developer is not really interested in the underlying datatype and would rather express the meaning or purpose of the variable. A name of their own for the datatype is then useful for creating the same category of variables again and again.

For example, suppose we want to have variables to store the marks of students. Marks can be fractional numbers. Using a primitive datatype we would declare the variables as below:

float flMarks1, flMarks2;

This tells the compiler that they are variables of type float. Since we have followed the naming convention, we can understand from the variable names that they contain marks and are of float type. But imagine that we are not interested in the type. In addition, we would like marks to be float throughout the program, in all the functions. If the program has multiple functions, there is a possibility that marks variables are declared with different datatypes in different functions. This may create bugs while assigning values or returning values from functions. Hence if we create our own datatype, marks, and use it for every marks variable, all functions and variables stay in sync.

That means we give the datatype float a second name, marks. This is done by using typedef in C.

typedef float marks; // marks is now another name for float

Now marks can be used to declare any variable as float. To keep the purpose of the declaration clear, all the marks variables are now declared as marks.

marks sub1_marks, sub2_marks;

Look at the example program below to understand how the datatype works across functions. marks is defined as a new datatype outside the main function so that it can be used in all the functions. It now acts as a global datatype for the program, and float is no longer used directly to declare any marks variable.

#include <stdio.h> 
typedef float marks; // redefines float as marks

void  fnTotal (marks m1, marks m2){
	marks total_marks;

	total_marks = m1 + m2;
	printf("Total Marks is: %f\n", total_marks);
}

int main(void) {

	marks sub1_marks, sub2_marks;
	sub1_marks = 67.5f;
	sub2_marks = 88;

	printf("Marks in Subject 1 is: %f\n", sub1_marks);
	printf("Marks in Subject 2 is: %f\n", sub2_marks);

	fnTotal (sub1_marks, sub2_marks); // calling the function
	return 0;
}

Enumerated Datatypes

Apart from the datatypes defined by C, C gives the user / developer the flexibility to define their own datatypes. In the traditional way of declaring a variable, when we declare it as int, float, array etc., we can store only that type of data in it. A structure or union allows different types of data within it, but it still does not let users define their own set of permitted values.

Suppose we need a datatype to represent the months of a year. We can declare a string array of size 12, but the declaration does not tell what values it can have. Either we have to enter the 12 months as input or we need to hard code the value for each index.

char *chrMonths[12] = {"January", "February", "March", … "December" };

char *chrMonths[12];
chrMonths[0] = "January";
chrMonths[1] = "February";
chrMonths[2] = "March";
...	 …
chrMonths[11] = "December";

Here, we need to define an array of character pointers or a 2-dimensional character array. Instead of making it so complex with arrays, pointers and character types, it is easier for anyone to understand if the months can be defined like any other datatype. Hence C provides another datatype called the enumerated datatype, enum. It can be considered a user-defined datatype too. It is declared and defined as shown below :

enum enum_datatype { value1, value2, value3, valueN };

Here enum_datatype is the name of the enumerated datatype, and it can have the values value1, value2, … valueN. Now we can use enum_datatype to declare variables, which are meant to take only the values defined in enum_datatype.

enum enum_datatype ed1, ed2, ed3;

For example, consider below enumerated datatype enumMonths.

enum enumMonths{January, February, March, .., December };
enum enumMonths monthJan, monthFeb, monthMar, monthDec;

monthJan = January;
monthFeb = February;
monthDec = December;

Here enumMonths is used to define the months in a year. When we define an enumerated datatype, we define its values too. Later we can create variables of the new datatype enumMonths, such as monthJan, monthFeb, monthMar, monthDec etc. These variables can take any one of the values listed while creating the datatype. Note that we have not assigned January, February etc. to the variables using quotes; the values are assigned directly from the enumerated list, as if they were variables themselves. What actually happens is that the compiler treats January, February, March etc. as named integer constants. Unless we specify otherwise, the first name gets the value 0 and each following name gets the next integer, so the 12 months get the values 0, 1, … 11. A variable of type enumMonths simply stores one of these integers.

#include <stdio.h> 

int main(void) {
	enum enumMonths{ January, February, March, December }; // Defining enumerated datatype; only four values are listed here, so December is 3
	enum enumMonths monthJan, monthFeb, monthMar, monthDec; // Declaring variables of type enumMonths

	// Assigning the values to the variables
	monthJan = January;
	monthFeb = February;
	monthDec = December;

	// Displaying the values
	printf("Value of monthJan is %d\n", monthJan);
	printf("Value of monthFeb is %d\n", monthFeb);
	printf("Value of monthDec is %d\n\n", monthDec);
	printf("Value of February is %d\n", February);
	printf("Value of December is %d\n", December);
	return 0;
}

Here we can notice that the program displays the integer values rather than the names January, February etc. This way of declaring a datatype is useful when we know in advance the number of values and what they are.