Introduction

Notes during problem solving.

Problem Description

You are given a 0-indexed string array words.
Two strings are similar if they consist of the same characters.

  • For example, "abca" and "cba" are similar since both consist of characters 'a', 'b', and 'c'.
  • However, "abacba" and "bcfd" are not similar since they do not consist of the same characters.
    Return the number of pairs (i, j) such that 0 <= i < j <= word.length - 1 and the two strings words[i] and words[j] are similar.

Code

void bitmask_init(char **words, int *bitmask, const int length_itr);
int asc_cmp(const void *a, const void *b);
void sort_wrapper(int *bitmask, const int length_itr);
int cmp_word_similarity(int *bitmask, int length_itr);
int similarPairs(char **words, int wordsSize);

// provides a unique bit mask that keeps count of which letters are used.
void
bitmask_init(char **words, int *bitmask, const int length_itr)
{
    for (int i = 0; i < length_itr; i++) {
        int word_length = strlen(words[i]);

        for (int j = 0; j < word_length; j++) {
            bitmask[i] = bitmask[i] | (1 << (words[i][j] - 'a'));
        }
    }
}

int
asc_cmp(const void *a, const void *b)
{
    return (*(int *)a - *(int *)b);
}

void
sort_wrapper(int *bitmask, const int length_itr)
{
    qsort(bitmask, length_itr, sizeof(int), asc_cmp);
}

int
cmp_word_similarity(int *bitmask, int length_itr)
{
    int similar_cnt = 0;

    for (int i = 0; i < length_itr; i++) {
        for (int j = i + 1; j < length_itr; j++) {
            if (bitmask[i] == bitmask[j]) {
                similar_cnt++;
            }
        }
    }

    return similar_cnt;
}

int
similarPairs(char **words, int wordsSize)
{
    int *bitmask = (int *)malloc(sizeof(int) * wordsSize);
    memset(bitmask, 0, sizeof(int) * wordsSize);

    bitmask_init(words, bitmask, wordsSize);
    sort_wrapper(bitmask, wordsSize);
    int similar_cnt = cmp_word_similarity(bitmask, wordsSize);

    free(bitmask);
    return similar_cnt;
}

Solution Explanation

After allocating memory to bitmask, it sets all values to 0 using memset().
Then, it is initialized by the OR operation. This sets the values corresponding to the alphabets to 1. Even for duplicates.
Sorts the bitmask to make it easier for comparison. Although the current algorithm doesn’t optimize on this.

  • Need a more effecient algorithm to compare. Utilize the sort.
    Finally, counts the number of identical bitmaps.

Code Explanation

  • Syntax
    Followed the GNU Coding Standards.
  • Functions
    • bitmask_init(): initializes the bit map by shifting 1s into the corresponding alphabet’s value.
    • asc_cmp(): a comparison function. Provides a ASC sort.
    • sort_wrapper(): a wrapper function of the qsort(). Provides extensibility of the sort function.
    • cmp_word_similarity(): compares the number of identical bitmasks.
    • similarPairs(): the starting point of the functionality. finds the similar pairs of the given string list.
  • Variables
    • bitmask: an int array of bitmasks. Each string has a bitmask that represents each distinct alphabet in it.
    • similar_cnt: holds the number of “similar” pairs of strings.
    • wordsSize: size of the array words.
    • length_itr: identical value as wordsSize. Personally, I don’t like the name, but tried to name it a generic one.
  • Etc.
    • other values were pre-defined, and used for consistency.

Conclusion

Second entry of my [PS] series.
During a call with forYourJoy, I again realized how important it is to practice my basics.
The actual reason I started this is because of the thought, “What is my greatest fear?”
My greatest fear is someone asking implement a simple file I/O program using your favourite language.

  1. I don’t know what my favourite language is..
  2. I wasn’t sure if I can implement the program..

File I/O is the one of the most fundamental strutures in coding.
So that is why I started LeetCoding from bit manipulation -> pointers -> file I/O.
Bon Voyage:)