I'm new to Hive and Hadoop and have learned that Hive is schema-on-read, which means you don't need to define a schema when writing data into HDFS.
I have a question about this.
Once a large amount of data has been loaded into HDFS, we need to run Hive queries to read it, and at read time a schema must already be defined. So we design the schema, register it in the metastore, create the tables, and so on.
Later, we are going to write more data into the same table and database in HDFS. Aren't we now writing the data against a schema? If so, does it still count as schema-on-read, or is it schema-on-write as well?
Please point out anything I'm missing here.
Note: 1. The first time, I didn't use HiveQL to store the data; I just copied the files into HDFS.
2. In the future, I will be inserting data into HDFS using HiveQL.
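To make the two steps concrete, here is a rough sketch of what I mean in HiveQL. The table name, columns, and HDFS path are made up for illustration, and I'm assuming a comma-delimited text file and a Hive version that supports INSERT ... VALUES.

```sql
-- Step 1: the files were copied into HDFS by hand, e.g. with
--   hdfs dfs -put events.csv /data/events/
-- (hypothetical file and path). A table is then declared over the existing files.
CREATE EXTERNAL TABLE events (
  id     INT,
  name   STRING,
  amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/events/';

-- Step 2: later, more rows are written into the same table through HiveQL.
INSERT INTO TABLE events VALUES (101, 'sample', 9.99);

-- Reading: the declared columns are applied to whatever files sit under LOCATION.
SELECT id, name, amount FROM events;
```

My question is about step 2: does writing through the table definition like this change how the schema is treated?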
P.S. I haven't found an answer to this on Google, Stack Overflow, or Quora. Please help :)