ACED PROJECT
This repository is (part of) my work for the ``Arquitecturas de Elevado
Desempenho'' (High-Performance Architectures) class. See doc/proposal.pdf for
more information on the problem statement.
[[Project Structure]]
root
├ aux       Auxiliary scripts.
├ contrib   External C libraries.
├ data      Data for use as input or embedding in source.
├ doc       Extra documentation related to the project.
├ includes  Header files.
├ obj       Compiled object files.
└ src       Root project source files.
The directory is essentially a C project, with auxiliary documentation and
scripts. The main executable can be compiled by running the corresponding
[sane] recipe:
python3 make.py link
This will produce a DEBUG [main.debug.exe] executable in the root directory.
If a RELEASE environment variable is set, a RELEASE [main.release.exe]
executable is produced instead, i.e., to produce a release executable, run:
RELEASE=1 python3 make.py link
See all available recipes by running
python3 make.py --list
and read more about [sane] at https://github.com/mikeevmm/sane.
[[Use Instructions]]
To use, pipe the input (structured according to the source specification) into
the executable, passing the number of outputs and inputs, in that order,
followed by a bin size (see the following section for details on the meaning
of the bin size). E.g.,
./main.release.exe 2 2 2 2 48 < data/2222_inq.txt
[[Notes on the Approach Taken]]
Originally, pairs of rows were compared directly. This was too slow to handle
even, for example, [data/3332_inq.txt]. (Cf. [data/README].) So I am now
trying a divide-and-conquer approach, explained below.
Consider the input to be N lines. From the ground truths, we know that the
number of non-equivalent rows, which we will denote k, is such that k << N.
So consider blocks of L lines each (k << L < N). Without loss of generality,
take N/L to be an integer. Within each block, at most k lines are
non-equivalent.
We consider a "reduction" within each block, and then comparisons across
blocks: if there are P permutations to be considered (see [doc/proposal.pdf]
for details) this means, worst-case scenario, less than
    (N/L)L²P + (N/2L)k²P + (N/4L)k²P + ... + (N/L)/2^log₂(N/L) k²P
  = (N/L)L²P + (N/L)k²P(1 - L/N)
  = NP[L + (k²/L)(1 - L/N)]                                        (1)
operations. Compare this with the "straightforward" mode of operation, where
N²P (2)
operations are required. Clearly, the divide-and-conquer approach has an
advantage as long as the term in square brackets in (1) is significantly smaller
than N.
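As a numerical sanity check, the two cost expressions can be compared directly.
The values of N, k, L, and P below are illustrative only, not taken from the
actual input data:

```python
# Compare the divide-and-conquer cost (1) against the brute-force cost (2).
# N, k, L, P are hypothetical values chosen to satisfy k << L < N.

def divide_and_conquer_ops(N, L, k, P):
    """Worst-case count from equation (1): N*P*[L + (k**2/L)*(1 - L/N)]."""
    return N * P * (L + (k**2 / L) * (1 - L / N))

def brute_force_ops(N, P):
    """Count from equation (2): N**2 * P."""
    return N**2 * P

N, k, P = 1_000_000, 100, 8   # hypothetical: many rows, few unique, P permutations
L = 1000                      # block size, with k << L < N

ratio = brute_force_ops(N, P) / divide_and_conquer_ops(N, L, k, P)
print(ratio)  # roughly N / (L + k**2/L), about 990x for these values
```

The ratio is dominated by N / (L + k²/L), which is why the bracketed term in
(1) being much smaller than N is the condition for a win.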
In pseudo-code:

    block_size is provided
    unique := []
    sizes := []

    base round:
        # distribute loop
        for block of rows of size block_size:
            local_unique := unique rows in local block
            push local_unique to unique
            push len(local_unique) to sizes

    reductions:
        while len(sizes) > 1:
            unique' := []
            sizes' := []
            # distribute loop
            for pair of blocks in (unique, sizes):  # Odd ones out are ignored
                local := concatenation of pair of blocks
                local_unique := unique rows in local
                push local_unique to unique'
                push len(local_unique) to sizes'
            unique := unique'
            sizes := sizes'
    done
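The scheme above can be sketched in Python. Two simplifications: "equivalent"
is reduced to plain equality (the real project compares rows under the P
permutations), and the odd block out is carried over to the next round rather
than ignored, so that no unique rows are dropped:

```python
# Minimal sketch of the block-wise reduction. Rows are modelled as tuples;
# equivalence is reduced to equality for brevity.

def unique_rows(rows):
    """Return the distinct rows of a block, preserving first-seen order."""
    seen, out = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out

def reduce_blocks(rows, block_size):
    # Base round: deduplicate within each block of block_size rows.
    blocks = [unique_rows(rows[i:i + block_size])
              for i in range(0, len(rows), block_size)]
    # Reduction rounds: merge blocks pairwise until one block remains.
    while len(blocks) > 1:
        merged = []
        for i in range(0, len(blocks) - 1, 2):
            merged.append(unique_rows(blocks[i] + blocks[i + 1]))
        if len(blocks) % 2 == 1:   # odd block out carries over
            merged.append(blocks[-1])
        blocks = merged
    return blocks[0] if blocks else []
```

Both the base-round loop and each reduction round's loop are embarrassingly
parallel (the "distribute loop" comments in the pseudo-code), which is what
makes the approach attractive on parallel hardware.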