What is HASHING?
Hashing is a method for storing and
retrieving records from a database efficiently.
Hashing allows insert, delete, and
search for records based on a search key value.
If the hashing implemented properly
and all above operations could be performed in constant time i.e. O (1). This
is better than binary search on a sorted array of n records, O (log n) average
cost required to do an operation on a binary search tree.
To get time complexity of O (1),
designers need to give attention while implementing a hash system.
What
is HASH TABLE?
A hash system stores records or
elements in an array called a hash table.
(Or) an effective data structure for
implementing dictionaries (the store List of records).
What
is HASH FUNCTION?
Hashing works by performing a
computation on a search key K in a way that is intended to identify the
position (or index) in HASH TABLE that contains the record with key K.
The function that does this calculation
is called the hash function, and will be denoted by the letter h (k).
Hash function takes input as a key and returns the index or slot. A position in
the hash table is also known as a slot or Index.
Hashing is not good for applications
where multiple records with the same key value are permitted. Without collision
& overflow, search only takes O (1) time. Data size is not concerned. Security:
If you do not know the hash function, you
cannot get data
A good hash function satisfies
(approximately) the assumption of simple uniform hashing
Each key is equally likely to hash to any
of the “m” slots. Independently of where
any other key has hashed to
Direct Addressing:
With
Direct addressing, an element with key (i.e. here i, j, k) stored in slot i, j,
k.
With Hashing:
These
elements stored in slot h(k1), h(k2), h(k3)
Following are the
list of Techniques can be used for Hash Function:
Division Method:
- Mapping a key k into one of “m” slots by taking the remainder of k divided by m
h (k) = k mod m
Ex. m = 12, k
= 100, then h(k) = 4
Or hash Index = key % hash_Table_Size
- Prime number “m” may be good choice
Mid square Method:
Mapping
a key k into one of m slots by get the middle some digits from value k2
h (k ) = k2
get middle (log m) digits
Ex.
m = 10000,
k = 113586, log m = 4
h(k)
= 1135862 get middle 4 digits
= 1290 1779
369 get middle 4 digits
= 1779
Folding:
- Portions of the key are often recombined, or folded together
- Divide k into some sections, besides the last section, have same length. Then add these sections together.
a.
shift folding
b.
folding at the boundaries
H(k) = ∑ (section divided from k) by a or b
Ex,
k = 12320324111220, section length = 3
Shift
Folding -> 123 + 203 + 241+ 112 +
20 => 879
Folding
at the boundaries -> 123 + 302 + 241+ 211 + 20 => 897
It can be efficiently performed using bitwise
operations
Collision & Overflow
handling
Collision
When two
keys map to the same location in the hash table.
There are two ways
to avoid collisions:-
- Open hashing (called as separate Chaining)
- Closed hashing (called as Open Addressing) (linear probing, quadratic probing, double hashing)
The difference
between the two has to do with whether collisions are stored outside the table
(open hashing), or whether collisions result in storing one of the records at
another slot in the table (closed hashing).
Open Hashing: (or) Separate chaining
- All keys that map to the same hash value are kept in a list (or “bucket”)
- An array of linked list implementation.
The
simplest form of open hashing defines each slot in the hash table to be the
head of a linked list. All records that hash to a particular slot are placed on
that slot's linked list.
The
figure illustrates a hash table where each slot stores one record and a link
pointer to the rest of the list.
Worst-case
insert time = O (1)
ü insert into
the beginning of each link list
Worst
-case search time = Θ(n)
ü Every key
mapping to the same slot
ü Ex. h(1) =
h(2) = h(3) =...= h(n) = x
ü then search
key 1
Analysis:
Worst
case complexity of all these operations is O(n) In the average case, the
running time is O(1 +α) where load factor
α
|
=
|
n/m
|
|
n
|
=
|
number
of elements stored
|
|
m
|
=
|
number
of hash values or buckets
|
Θ (1 +α ) means ?
If the number of hash table slots is
at least proportional to the number of elements in the
table, we have n = O(m) and, α =
n/m = O(m)/m = O(1). Thus, searching takes constant time
on average.
Or hash value h(k) can be computed in
O(1) time. If n is O (m), the average case complexity of these operations
becomes O(1) !
Closed Hashing:
- No chaining, every key fits in the hash table
- All elements are stored in the hash table itself
- Avoids pointers; only computes the sequence of slots to be examined.
- Collisions are handled by generating a sequence of rehash values.
- linear probing,
- quadratic probing,
- double hashing
The below program designed for Uniform Hashing ( No collisions) used division method.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
/* For Any HASH Implementation a Perfect Hash Function is
* required to avoid collisions */
/* In real time time is it possible to have perfect Hash function
* Some times possible and may not possible also
* Then How to avoid collisions ( Having Same Index value) to store the values
* Use seperate chaining or Bucket Hashing
*/
typedef struct
{
int data;
}HASH_TABLE;
HASH_TABLE **hashTable;
int *value;
void intialise_values_and_hashTable(int maxNum, int hashTableSize)
{
int i;
/* Intialise the data */
value = (int *)malloc(maxNum * sizeof(int));
/* Intialise the Hash Table */
hashTable = (HASH_TABLE **)malloc(hashTableSize * sizeof(int));
for(i=0; i<maxNum; i++)
{
value[i] = i;
}
/* Allocate the Memory Pool of Data */
for(i=0; i<hashTableSize; i++)
{
hashTable[i] = (HASH_TABLE *)malloc(sizeof(HASH_TABLE));
/* reset the Data */
memset(hashTable[i], 0x00, sizeof(HASH_TABLE));
}
}
int hashImplementation(int maxNum, int hashTableSize)
{
int keyIndex, key, hashIndex;
printf(" Enter the Key Index to get appropriate Key (0 - %d )\n",
maxNum-1);
scanf("%d", &keyIndex);
/* Get the Hash Key, Ensure that hash key
* should be unique
* (e.g: For Key computation use << or >> operators)
*/
key = getKey(keyIndex, maxNum);
printf(" The Key : %d\n", key);
/* Compute the Hash Index
* This is one of the Important step in Hash Function
* Function can be computed based on key and HashTable Size
*/
hashIndex = hashFunction(key, hashTableSize);
printf(" The HashIndex : %d\n", hashIndex);
return hashIndex;
}
/*
* getKey fn
*/
int getKey(int keyIndex, int maxNum)
{
int i;
for(i=0; i<maxNum; i++)
{
if(keyIndex == i)
return value[i];
}
if(i == maxNum)
printf(" No Index Is found\n");
return 0;
}
/*
* hashIndex fn
*/
int hashFunction(int key, int hashTableSize)
{
int index;
index = key % hashTableSize;
return index;
}
void insertDataIntoHashTable(int info, int hashIndex)
{
if(!hashTable[hashIndex]->data)
{
hashTable[hashIndex]->data = info;
}
else
{
printf(" Data : %d already exists for hashIndex : %d \n:",
hashTable[hashIndex]->data, hashIndex);
}
}
void deleteDataFromHashTable(int hashIndex)
{
hashTable[hashIndex]->data = 0;
}
void displayHashTable(int hashTableSize)
{
int i;
for(i=0; i<hashTableSize; i++)
{
printf("Index : %d -> | data: %d\t |\n", i,
hashTable[i]->data);
}
}
/*
* Main()
*/
int main()
{
int i, maxNum, hashTableSize, info, hashIndex, ch;
printf("Enter the Max Number of Entries to be present in Hash Table\n");
scanf("%d", &maxNum);
printf("Enter the HASH TABLE SIZE\n");
scanf("%d", &hashTableSize);
intialise_values_and_hashTable(maxNum, hashTableSize );
do
{
printf("Enter the choice \n");
printf("1. Insert \n");
printf("2. Delete \n");
printf("3. Display \n");
printf("10. Exit \n");
scanf("%d", &ch);
switch(ch)
{
case 1:
{
printf(" Enter the Data to be inserted \n");
scanf("%d", &info);
hashIndex = hashImplementation(maxNum, hashTableSize);
/* Insert the data in to Hash Table
*/
insertDataIntoHashTable(info, hashIndex);
}
break;
case 2:
{
printf(" Enter the hashIndex ( Should not be > HashTableSize\n");
scanf("%d", &hashIndex);
deleteDataFromHashTable(hashIndex);
}
break;
case 3:
/* Display*/
displayHashTable(hashTableSize);
break;
default:
printf("Invalid choice \n");
break;
}
} while(ch != 10);
/* Free the Allocated Hash Table memory */
for(i=0; i<hashTableSize; i++)
{
free(hashTable[i]);
}
free(value);
free(hashTable);
}
The below program designed with collisions( Used Open Hashing, separate chaining).
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
/* For Any HASH Implementation a Perfect Hash Function is
* required to avoid collisions */
/* In real time time is it possible to have perfect Hash function
* Some times possible and may not possible also
* Then How to avoid collisions ( Having Same Index value) to store the values
* Use seperate chaining or Bucket Hashing
*/
struct newNode
{
int data;
int key;
struct newNode *next;
};
typedef struct newNode newData;
typedef struct
{
int count;
struct newNode *new;
}HASH_TABLE;
HASH_TABLE **hashTable;
int *value;
/* Bucket Hashing */
HASH_TABLE *head, *last;
void intialise_values_and_hashTable(int maxNum, int hashTableSize)
{
int i;
/* Intialise the data */
value = (int *)malloc(maxNum * sizeof(int));
/* Intialise the Hash Table */
hashTable = (HASH_TABLE **)malloc(hashTableSize * sizeof(int));
for(i=0; i<maxNum; i++)
{
value[i] = rand();
}
/* Allocate the Memory Pool of Data */
for(i=0; i<hashTableSize; i++)
{
hashTable[i] = (HASH_TABLE *)malloc(sizeof(HASH_TABLE));
/* reset the Data */
memset(hashTable[i], 0x00, sizeof(HASH_TABLE));
}
}
int hashImplementation(int maxNum, int hashTableSize)
{
int keyIndex, key, hashIndex;
printf(" Enter the Key Index to get appropriate Key (0 - %d )\n",
maxNum-1);
scanf("%d", &keyIndex);
/* Get the Hash Key, Ensure that hash key
* should be unique
* (e.g: For Key computation use << or >> operators)
*/
key = getKey(keyIndex, maxNum);
printf(" The Key : %d\n", key);
/* Compute the Hash Index
* This is one of the Important step in Hash Function
* Function can be computed based on key and HashTable Size
*/
hashIndex = hashFunction(key, hashTableSize);
printf(" The HashIndex : %d\n", hashIndex);
return hashIndex;
}
/*
* getKey fn
*/
int getKey(int keyIndex, int maxNum)
{
int i;
for(i=0; i<maxNum; i++)
{
if(keyIndex == i)
return value[i];
}
if(i == maxNum)
printf(" No Index Is found\n");
return 0;
}
/*
* hashIndex fn
*/
int hashFunction(int key, int hashTableSize)
{
int index;
index = key % hashTableSize;
return index;
}
newData *createNewData(int info, int keyData)
{
newData *new = NULL;
new = (newData *)malloc(sizeof(newData));
if(new)
{
new->data = info;
new->key = keyData;
new->next = 0;
}
return new;
}
void insertDataIntoHashTable(int info, int hashIndex, int keyData)
{
newData *newInfo = NULL, *temp =NULL;
/* Create New Data */
newInfo = createNewData(info, keyData);
if(!hashTable[hashIndex]->new)
{
hashTable[hashIndex]->new = newInfo;
}
else
{
temp = hashTable[hashIndex]->new;
while(temp->next!=0)
{
temp = temp->next;
}
temp->next = newInfo;
}
hashTable[hashIndex]->count++;
}
void deleteDataFromHashTable(int hashIndex, int keyData)
{
newData *head, *temp, *prev, *current= NULL;
int count = 0;
current = hashTable[hashIndex]->new;
head = hashTable[hashIndex]->new;
while(current)
{
if(current->key == keyData)
{
if(current == head)
{
printf(" It is going inside loop \n");
temp = current;
current = current->next;
hashTable[hashIndex]->new = current;
free(temp);
temp = NULL;
return;
}
else if(current->next == 0)
{
temp = current;
prev->next = 0;
free(temp);
temp= NULL;
return;
}
else
{
temp = current;
prev->next = current->next;
free(temp);
temp= NULL;
return;
}
}
prev = current;
current = current->next;
count++;
}
if(count == hashTable[hashIndex]->count)
{
printf("Data is not found in the hashTable \n");
}
}
void displayHashTable(int hashTableSize)
{
int i;
newData *temp=NULL;
for(i=0; i<hashTableSize; i++)
{
temp = hashTable[i]->new;
while(temp)
{
printf(" Key :%d Data : %d and hashIndex : %d\n", temp->key,
temp->data, i);
temp = temp->next;
}
}
}
/*
* Main()
*/
int main()
{
int i, keyData, maxNum, hashTableSize, info, hashIndex, ch;
printf("Enter the Max Number of Entries to be present in Hash Table\n");
scanf("%d", &maxNum);
printf("Enter the HASH TABLE SIZE\n");
scanf("%d", &hashTableSize);
intialise_values_and_hashTable(maxNum, hashTableSize );
do
{
printf("Enter the choice \n");
printf("1. Insert \n");
printf("2. Delete \n");
printf("3. Display \n");
printf("10. Exit \n");
scanf("%d", &ch);
switch(ch)
{
case 1:
{
printf(" Enter the Data to be inserted \n");
scanf("%d", &info);
printf(" Enter the key Data to be inserted \n");
scanf("%d", &keyData);
hashIndex = hashImplementation(maxNum, hashTableSize);
/* Insert the data in to Hash Table
*/
insertDataIntoHashTable(info, hashIndex, keyData);
}
break;
case 2:
{
printf(" Enter the hashIndex ( Should not be > HashTableSize\n");
scanf("%d", &hashIndex);
printf("Enter the KeyData based on tat deletion \n");
scanf("%d", &keyData);
deleteDataFromHashTable(hashIndex, keyData);
}
break;
case 3:
/* Display*/
displayHashTable(hashTableSize);
break;
default:
printf("Invalid choice \n");
break;
}
} while(ch != 10);
/* Free the Allocated Hash Table memory */
for(i=0; i<hashTableSize; i++)
{
free(hashTable[i]);
}
free(value);
free(hashTable);
}
#include<stdlib.h>
#include<string.h>
/* For Any HASH Implementation a Perfect Hash Function is
* required to avoid collisions */
/* In real time time is it possible to have perfect Hash function
* Some times possible and may not possible also
* Then How to avoid collisions ( Having Same Index value) to store the values
* Use seperate chaining or Bucket Hashing
*/
struct newNode
{
int data;
int key;
struct newNode *next;
};
typedef struct newNode newData;
typedef struct
{
int count;
struct newNode *new;
}HASH_TABLE;
HASH_TABLE **hashTable;
int *value;
/* Bucket Hashing */
HASH_TABLE *head, *last;
void intialise_values_and_hashTable(int maxNum, int hashTableSize)
{
int i;
/* Intialise the data */
value = (int *)malloc(maxNum * sizeof(int));
/* Intialise the Hash Table */
hashTable = (HASH_TABLE **)malloc(hashTableSize * sizeof(int));
for(i=0; i<maxNum; i++)
{
value[i] = rand();
}
/* Allocate the Memory Pool of Data */
for(i=0; i<hashTableSize; i++)
{
hashTable[i] = (HASH_TABLE *)malloc(sizeof(HASH_TABLE));
/* reset the Data */
memset(hashTable[i], 0x00, sizeof(HASH_TABLE));
}
}
int hashImplementation(int maxNum, int hashTableSize)
{
int keyIndex, key, hashIndex;
printf(" Enter the Key Index to get appropriate Key (0 - %d )\n",
maxNum-1);
scanf("%d", &keyIndex);
/* Get the Hash Key, Ensure that hash key
* should be unique
* (e.g: For Key computation use << or >> operators)
*/
key = getKey(keyIndex, maxNum);
printf(" The Key : %d\n", key);
/* Compute the Hash Index
* This is one of the Important step in Hash Function
* Function can be computed based on key and HashTable Size
*/
hashIndex = hashFunction(key, hashTableSize);
printf(" The HashIndex : %d\n", hashIndex);
return hashIndex;
}
/*
* getKey fn
*/
int getKey(int keyIndex, int maxNum)
{
int i;
for(i=0; i<maxNum; i++)
{
if(keyIndex == i)
return value[i];
}
if(i == maxNum)
printf(" No Index Is found\n");
return 0;
}
/*
* hashIndex fn
*/
int hashFunction(int key, int hashTableSize)
{
int index;
index = key % hashTableSize;
return index;
}
newData *createNewData(int info, int keyData)
{
newData *new = NULL;
new = (newData *)malloc(sizeof(newData));
if(new)
{
new->data = info;
new->key = keyData;
new->next = 0;
}
return new;
}
void insertDataIntoHashTable(int info, int hashIndex, int keyData)
{
newData *newInfo = NULL, *temp =NULL;
/* Create New Data */
newInfo = createNewData(info, keyData);
if(!hashTable[hashIndex]->new)
{
hashTable[hashIndex]->new = newInfo;
}
else
{
temp = hashTable[hashIndex]->new;
while(temp->next!=0)
{
temp = temp->next;
}
temp->next = newInfo;
}
hashTable[hashIndex]->count++;
}
void deleteDataFromHashTable(int hashIndex, int keyData)
{
newData *head, *temp, *prev, *current= NULL;
int count = 0;
current = hashTable[hashIndex]->new;
head = hashTable[hashIndex]->new;
while(current)
{
if(current->key == keyData)
{
if(current == head)
{
printf(" It is going inside loop \n");
temp = current;
current = current->next;
hashTable[hashIndex]->new = current;
free(temp);
temp = NULL;
return;
}
else if(current->next == 0)
{
temp = current;
prev->next = 0;
free(temp);
temp= NULL;
return;
}
else
{
temp = current;
prev->next = current->next;
free(temp);
temp= NULL;
return;
}
}
prev = current;
current = current->next;
count++;
}
if(count == hashTable[hashIndex]->count)
{
printf("Data is not found in the hashTable \n");
}
}
void displayHashTable(int hashTableSize)
{
int i;
newData *temp=NULL;
for(i=0; i<hashTableSize; i++)
{
temp = hashTable[i]->new;
while(temp)
{
printf(" Key :%d Data : %d and hashIndex : %d\n", temp->key,
temp->data, i);
temp = temp->next;
}
}
}
/*
* Main()
*/
int main()
{
int i, keyData, maxNum, hashTableSize, info, hashIndex, ch;
printf("Enter the Max Number of Entries to be present in Hash Table\n");
scanf("%d", &maxNum);
printf("Enter the HASH TABLE SIZE\n");
scanf("%d", &hashTableSize);
intialise_values_and_hashTable(maxNum, hashTableSize );
do
{
printf("Enter the choice \n");
printf("1. Insert \n");
printf("2. Delete \n");
printf("3. Display \n");
printf("10. Exit \n");
scanf("%d", &ch);
switch(ch)
{
case 1:
{
printf(" Enter the Data to be inserted \n");
scanf("%d", &info);
printf(" Enter the key Data to be inserted \n");
scanf("%d", &keyData);
hashIndex = hashImplementation(maxNum, hashTableSize);
/* Insert the data in to Hash Table
*/
insertDataIntoHashTable(info, hashIndex, keyData);
}
break;
case 2:
{
printf(" Enter the hashIndex ( Should not be > HashTableSize\n");
scanf("%d", &hashIndex);
printf("Enter the KeyData based on tat deletion \n");
scanf("%d", &keyData);
deleteDataFromHashTable(hashIndex, keyData);
}
break;
case 3:
/* Display*/
displayHashTable(hashTableSize);
break;
default:
printf("Invalid choice \n");
break;
}
} while(ch != 10);
/* Free the Allocated Hash Table memory */
for(i=0; i<hashTableSize; i++)
{
free(hashTable[i]);
}
free(value);
free(hashTable);
}
No comments:
Post a Comment