Find the element that Appears Once - Using BitSet

We can use different approaches:
1. Use two hashsets(as we don't need the real count): uniqueSet, duplicateSet
The first time we see the element, we put it into both sets, if we see it again, we remove it from uniqueSet.
2. We can use BitSet to reduce space - uniqueBitSet and duplicateBitSet
The first time we see the element, we put set its bit in uniqueBitSet and duplicateBitSet, if we see it again, we clear its bit in duplicateBitSet.
    // Using Bitset
    public static int findUnqiueNumber1(final int[] array) {
        final BitSet unique = new BitSet(), duplicate = new BitSet();
        for (final int number : array) {
            // first time see this number
            if (!duplicate.get(number)) {
                unique.set(number);
                duplicate.set(number);
            } else {
                // if see it again
                unique.clear(number);
            }
        }

        final int unqinueValue = unique.nextSetBit(0);

        if (unqinueValue == -1) {
            throw new NoUniqueException("No unique number");
        }
        final int nextUnqiueValue = unique.nextSetBit(unqinueValue + 1);
        if (nextUnqiueValue != -1) {
            throw new MoreThanOneUniqueException(String.format(
                    "More than 1 unique number found. We found at least %s, %s", unqinueValue, nextUnqiueValue));
        }

        return unqinueValue;
    }

    // Using HashSet
    public static int findUnqiueNumber2(final int[] array) {
        final Set<Integer> uniqueSet = new HashSet<>(), duplicateSet = new HashSet<>();
        for (final int number : array) {
            // first time see this number
            if (!duplicateSet.contains(number)) {
                uniqueSet.add(number);
                duplicateSet.add(number);
            } else {
                // if see it again
                uniqueSet.remove(number);
            }
        }

        if (uniqueSet.isEmpty()) {
            throw new NoUniqueException("No unique number");
        }
        if (uniqueSet.size() > 1) {
            throw new MoreThanOneUniqueException(
                    String.format("More than 1 unique number found. We found %s count", uniqueSet.size()));
        }
        final int unqinueValue = uniqueSet.iterator().next();
        return unqinueValue;
    }


    private static class MoreThanOneUniqueException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        public MoreThanOneUniqueException(final String messge) {
            super(messge);
        }
    }
    private static class NoUniqueException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        public NoUniqueException(final String messge) {
            super(messge);
        }
    }

    @Rule
    public ExpectedException thrown = ExpectedException.none();

    @Test
    public void test() {
        Assert.assertEquals(findUnqiueNumber2(new int[] {1}), 1);
        Assert.assertEquals(findUnqiueNumber2(new int[] {1, 2, 2}), 1);
        Assert.assertEquals(findUnqiueNumber2(new int[] {1, 2, 2}), 1);
    }

    @Test
    public void testNoUnqiue() {
        thrown.expect(NoUniqueException.class);
        findUnqiueNumber2(new int[] {});
    }

    @Test
    public void testNoUnqiue2() {
        thrown.expect(NoUniqueException.class);
        findUnqiueNumber2(new int[] {1, 1, 2, 2});
    }

    @Test
    public void testMoreThanONeUnqiue() {
        thrown.expect(MoreThanOneUniqueException.class);
        findUnqiueNumber2(new int[] {1, 2});
    } 
Related:
[CareerCup] 10.4 Find All Duplicates Elements
You have an array with all the numbers from 1 to N, where N is at most 32,000. The array may have duplicate entries and you do not know what N is. With only 4 kilobytes of memory available, how would you print all duplicate elements in the array?

It can also be solved by using BitSet. add i into bitset, if i happens again, print it out.
Bonus:
org.apache.commons.lang3.ArrayUtils.removeElements(int[], int...)
    public static int[] removeElements(final int[] array, final int... values) {
        if (isEmpty(array) || isEmpty(values)) {
            return clone(array);
        }
        final HashMap<Integer, MutableInt> occurrences = new HashMap<Integer, MutableInt>(values.length); // use MutableInt to avoid Integer creation again and again.
        for (final int v : values) {
            final Integer boxed = Integer.valueOf(v);
            final MutableInt count = occurrences.get(boxed);
            if (count == null) {
                occurrences.put(boxed, new MutableInt(1));
            } else {
                count.increment();
            }
        }
        final BitSet toRemove = new BitSet();
        for (final Map.Entry<Integer, MutableInt> e : occurrences.entrySet()) {
            final Integer v = e.getKey();
            int found = 0;
            for (int i = 0, ct = e.getValue().intValue(); i < ct; i++) {
                found = indexOf(array, v.intValue(), found);
                if (found < 0) {
                    break;
                }
                toRemove.set(found++);
            }
        }
        return (int[]) removeAll(array, toRemove);
    }
    static Object removeAll(final Object array, final BitSet indices) {
        final int srcLength = ArrayUtils.getLength(array);
        // No need to check maxIndex here, because method only currently called from removeElements()
        // which guarantee to generate on;y valid bit entries.
//        final int maxIndex = indices.length();
//        if (maxIndex > srcLength) { 
//            throw new IndexOutOfBoundsException("Index: " + (maxIndex-1) + ", Length: " + srcLength);
//        }
        final int removals = indices.cardinality(); // true bits are items to remove
        final Object result = Array.newInstance(array.getClass().getComponentType(), srcLength - removals);
        int srcIndex=0;
        int destIndex=0;
        int count;
        int set;
        while((set = indices.nextSetBit(srcIndex)) != -1){
            count = set - srcIndex;
            if (count > 0) {
               System.arraycopy(array, srcIndex, result, destIndex, count); // System.arraycopy is native and faster.
               destIndex += count;
            }
            srcIndex = indices.nextClearBit(set);
        }
        count = srcLength - srcIndex;
        if (count > 0) {
            System.arraycopy(array, srcIndex, result, destIndex, count);            
        }
        return result;
    }

java.util.BitSet.cardinality()
public int cardinality() {
    int sum = 0;
    for (int i = 0; i < wordsInUse; i++)
        sum += Long.bitCount(words[i]);
    return sum;
}    
Post a Comment

Labels

Java (159) Lucene-Solr (110) All (60) Interview (59) J2SE (53) Algorithm (37) Eclipse (35) Soft Skills (35) Code Example (31) Linux (26) JavaScript (23) Spring (22) Windows (22) Web Development (20) Tools (19) Nutch2 (18) Bugs (17) Debug (15) Defects (14) Text Mining (14) J2EE (13) Network (13) PowerShell (11) Chrome (9) Continuous Integration (9) How to (9) Learning code (9) Performance (9) UIMA (9) html (9) Design (8) Dynamic Languages (8) Http Client (8) Maven (8) Security (8) Trouble Shooting (8) bat (8) blogger (8) Big Data (7) Google (7) Guava (7) JSON (7) Problem Solving (7) ANT (6) Coding Skills (6) Database (6) Scala (6) Shell (6) css (6) Algorithm Series (5) Cache (5) IDE (5) Lesson Learned (5) Miscs (5) Programmer Skills (5) System Design (5) Tips (5) adsense (5) xml (5) AIX (4) Code Quality (4) GAE (4) Git (4) Good Programming Practices (4) Jackson (4) Memory Usage (4) OpenNLP (4) Project Managment (4) Python (4) Spark (4) Testing (4) ads (4) regular-expression (4) Android (3) Apache Spark (3) Become a Better You (3) Concurrency (3) Eclipse RCP (3) English (3) Firefox (3) Happy Hacking (3) IBM (3) J2SE Knowledge Series (3) JAX-RS (3) Jetty (3) Restful Web Service (3) Script (3) regex (3) seo (3) .Net (2) Android Studio (2) Apache (2) Apache Procrun (2) Architecture (2) Batch (2) Build (2) Building Scalable Web Sites (2) C# (2) C/C++ (2) CSV (2) Career (2) Cassandra (2) Distributed (2) Fiddler (2) Google Drive (2) Gson (2) Html Parser (2) Http (2) Image Tools (2) JQuery (2) Jersey (2) LDAP (2) Life (2) Logging (2) Software Issues (2) Storage (2) Text Search (2) xml parser (2) AOP (1) Application Design (1) AspectJ (1) Bit Operation (1) Chrome DevTools (1) Cloud (1) Codility (1) Data Mining (1) Data Structure (1) ExceptionUtils (1) Exif (1) Feature Request (1) FindBugs (1) Greasemonkey (1) HTML5 (1) Httpd (1) I18N (1) IBM Java Thread Dump Analyzer (1) JDK Source Code (1) JDK8 (1) JMX (1) Lazy Developer (1) Mac (1) Machine Learning (1) Mobile (1) My Plan for 2010 (1) Netbeans (1) Notes (1) Operating System (1) Perl (1) Problems (1) Product Architecture (1) Programming Life (1) Quality (1) Redhat (1) Redis (1) Review (1) RxJava (1) Solutions logs (1) Team Management (1) Thread Dump Analyzer (1) Troubleshooting (1) Visualization (1) boilerpipe (1) htm (1) ongoing (1) procrun (1) rss (1)

Popular Posts