The British Library and five other “legal deposit libraries” will have the right to collect and store everything that is published online in the UK.
The move follows 10 years of planning and will also offer visitors access to material currently behind paywalls.
Richard Gibby from the British Library says there is a common belief that the average web page lasts just 75 days.
The other institutions involved are the National Libraries of Scotland and Wales, the Bodleian Libraries in Oxford, the University Library, Cambridge and the Library of Trinity College, Dublin.
The archive will cover 4.8 million websites and will include magazines, books and academic journals as well as alternative sources of literature, news and comment such as Mumsnet, the Beano online, Stephen Hawking’s website, and the unofficial armed forces’ bulletin board, ARRSE.
Millions of tweets, Facebook status updates and even a blog about a bus shelter in Shetland are to be preserved for the nation.
Ben Sanderson from the British Library said that while people may think information on the web lasts forever, huge amounts of research material have already disappeared.
Mr Sanderson explained that with much of public life having migrated to the online world, material that is now published physically gives only a part of the story and debate within modern Britain.
He said: “It will be impossible to tell for instance the story of the 2015 general election without accessing what appears on the web.”
The new databases will cover all areas of interest. For example, the website Style Scout – a fashion blog documenting London street fashion – will give historians a snapshot of what people were wearing in 2013.
As part of the launch of the process, the British Library has commissioned a survey of the top 100 websites that ought to be preserved for historians and researchers.
The sites recommended for preservation include eBay, Facebook, Twitter, Tripadvisor and Rightmove.
Lesser-known ones include the Anarchist Federation, the Dracula Society and The Dreamcast Junkyard – a blog dedicated to the community of gamers who continue to play Dreamcast games online, despite the fact they were officially discontinued in 2002.
The British Library is also asking for advice from the public as to which websites should be preserved to give an accurate picture to future generations.
Jim Killock, executive director of the Open Rights Group, told the BBC News website: “The idea of the British Library preserving published content from UK websites is a great one.
“My concern is that a lot of Facebook comments are public and people don’t realise they’re publishing to the world. That’s Facebook’s fault, not the British Library’s – their user settings need to be changed in line with people’s expectations.
“Twitter, on the other hand, is avowedly public – it’s very clear you’re publishing to the world.”
It’s great that the powers that be have woken up to the fact that more things are going on online than offline. But as Jim Killock warns, be EVEN MORE WARY about what you post online, as this initiative makes it even more important that your online activities stay legal, honest, decent and truthful.