Design CareerMar 31, 2018
NewVTMt00

Which data structure to use

Given a message flow of numbers collected from a async process (the source is a incremental generator with gaps) print the data in continuous interval format. if the elements received so far are  5, 3, 15, 10, 17, 13, 6,  9, 7, 4, 8, 14 [3, 10], [13,15], [17, 17] what data structure would you use to store the incoming messages ? what is the time complexity of accomodating a new number. For example if element 16 comes in above sequence then the interval display would be [3,10] , [13,17]. Can you implement a O(LogN) technique to incorporate new element ?

Add a comment
Google quora_TW Mar 31, 2018

A balanced binary search tree where each node is an interval and the key is the interval lower bound would do just fine. How to perform updates is left as an exercise for the reader.

Lyft CygnusX2 Mar 31, 2018

Binary search on a list could work

McAfee baqko45 Mar 31, 2018

Segment tree? Maybe

LinkedIn Ybevis Mar 31, 2018

Hash table

Microsoft yMcg47 Mar 31, 2018

In fact a hash table would work better here for run time. You get o(1) insertion time and o(m) readout where m is the number of intervals. This is in contrast to o(log n) insertion and o(m) readout using a tree with interval node and min key. Homework: no free lunch. What are you sacrificing here?

LinkedIn ufnppw Mar 31, 2018

Hashtable works if the stream has only unique integers. If that's not a guarantee then find and union would be better

Lyft meepleWork Mar 31, 2018

What about union find tree? New number n, query for n-1 and n+1. Update ranges accordingly or add new node.

Amazon CorrectBat Apr 1, 2018

I don’t understand, are you updating the continuous interval every time it updates?