Recent Posts
11
Hosting / Re: Offshore / Bulletproof Hosting Links
Last post by Admin -
Super Duper Bulletproof Host For Low Bandwidth Projects
They host the infamous Maza hacking forum.
Please note the small print at the bottom, though. It says only 300-500GB of international bandwidth. Not good for high-bandwidth sites or projects.
13
World News / Fund Elon Musk to Nuke Mars?
Last post by Thomas -
I'm not sure what kind of news this is, or whether it even qualifies as world news.
But you can buy 'Nuke Mars' T-shirts.
I hope the profits will be used to buy nukes, but those are expensive, so buy up, people!

If a man asks you for money to nuke an uninhabitable planet to make it habitable, how much would you give him? Would you do it?

Buy T-Shirt: [link]
Slashdot: [link]
Nuke Mars tweet: [link]

Also a poll: would you like us to nuke Mars?

To continue the discussion:
Are there any scientists here who know enough science to confirm or debunk whether this is possible (making Mars habitable via nuclear warheads)?
I've read you could heat up the planet, BUT you have nothing to shield it from the solar wind and all that horrible stuff, so it wouldn't last very long.
14
Tech News / Can JPEG XL Become the Next Free and Open Image Format?
Last post by Admin -

Quote
The JPEG XL Image Coding System (ISO/IEC 18181) has a richer feature set than existing codecs and can deliver images with similar quality at a third of the size of widely used alternatives. It is designed with responsive web design in mind, so that content renders well on a wide range of devices. The JPEG XL coding tools include variable-size DCT, nonlinear Haar transforms, multiresolution encoding, adaptive quantization, adaptive loop filters and context modeling.

JPEG XL includes several features that help transition from the legacy JPEG format. Existing JPEG files can be losslessly transcoded to JPEG XL, while significantly reducing their size. A lightweight lossless conversion process back to JPEG ensures compatibility with existing JPEG-only clients such as older generation phones and browsers. Thus it is easy to migrate to JPEG XL, because servers can store a single JPEG XL file to serve both JPEG and JPEG XL clients. JPEG XL decoders can perform enhancement that would improve image quality when dealing with the legacy JPEG format. JPEG XL encoders may also choose to add a small amount of additional information to further enhance the quality of decoded images, while remaining backward-compatible with existing legacy JPEG decoders.

JPEG XL is also designed to meet the needs of high-quality imaging and professional photography. A color-managed processing pipeline with full 32 bit per channel precision enables support for wide-color-gamut/high-dynamic-range images. JPEG XL reaches high compression efficiency at visually lossless quality (as defined in ISO/IEC 29170-2) using psychovisual modeling plugins.

JPEG XL is designed for efficient decoding even in software, with parallel and SIMD-friendly coding tools. JPEG XL compares favorably with contemporary coding solutions in terms of complexity.

JPEG XL further includes features such as animations, alpha channels, lossless and progressive coding to support a wide range of use cases including but not limited to photo galleries, e-commerce, social media, user interfaces and cloud storage. To enable novel applications, it also adds support for 360 degree images, image bursts, large panoramas/mosaics, and printing.
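The migration story in the quote above – store a single JPEG XL file, serve legacy clients a losslessly reconstructed JPEG – boils down to HTTP content negotiation on the server side. A minimal sketch (the function name and the simplified Accept-header check are mine, not from the spec; real servers would also honor q-values):

```python
def pick_image_mime(accept_header: str) -> str:
    """Decide which representation of a stored JPEG XL image to serve.

    Clients advertising image/jxl get the original file; everyone else
    gets the lossless JPEG reconstruction described in the quote above.
    """
    if "image/jxl" in accept_header:
        return "image/jxl"
    return "image/jpeg"

print(pick_image_mime("image/jxl,image/webp,*/*"))  # image/jxl
print(pick_image_mime("image/webp,*/*"))            # image/jpeg
```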

15
Tech News / Cloudflare Flags Copyright Lawsuits as Potential Liabilities Ahead of IPO
Last post by Admin -

Quote
Cloudflare, the CDN company currently serving around 20 million Internet domains, sites, applications and APIs, has filed to go public. In its statement, the company warns that the activities of some of its customers, which include pirate sites, could expose it to significant copyright infringement liabilities in the future.

As a CDN and security company, Cloudflare currently serves around 20 million “Internet properties”, ranging from domains and websites through to application programming interfaces (APIs) and mobile applications.

At least hundreds of those properties, potentially more, are considered ‘pirate’ platforms by copyright groups, which has resulted in Cloudflare being sucked into copyright infringement lawsuits due to the activities of its customers.

On Thursday, Cloudflare filed to go public by submitting the required S-1 registration statement. It contains numerous warnings that copyright infringement lawsuits, both current and those that may appear in the future, could present significant issues of liability for the company.

Noting that some of Cloudflare’s customers may use its services in violation of the law, the company states that existing laws relating to the liability of service providers are “highly unsettled and in flux”, both in the United States and further afield.

“For example, we have been named as a defendant in a number of lawsuits, both in the United States and abroad, alleging copyright infringement based on content that is made available through our customers’ websites,” the filing reads.

“There can be no assurance that we will not face similar litigation in the future or that we will prevail in any litigation we may face. An adverse decision in one or more of these lawsuits could materially and adversely affect our business, results of operations, and financial condition.”

Cloudflare goes on to reference the safe harbor provisions of the DMCA, noting that they may not offer “complete protection” for the company or could even be amended in the future to its detriment.

“If we are found not to be protected by the safe harbor provisions of the DMCA, CDA [Communications Decency Act] or other similar laws, or if we are deemed subject to laws in other countries that may not have the same protections or that may impose more onerous obligations on us, we may face claims for substantial damages and our brand, reputation, and financial results may be harmed. Such claims may result in liability that exceeds our ability to pay or our insurance coverage,” Cloudflare warns.

As a global company, it’s not only US law the company has to consider. Cloudflare references the recently-approved Copyright Directive in the EU, noting that it also has the potential to expose Cloudflare and other online platforms to liability.

As recently as last month and in advance of any claims under that particular legislation, Cloudflare experienced an adverse ruling in an Italian court. Local broadcaster RTI successfully argued that Cloudflare can be held liable if it willingly fails to act in response to copyright infringement notices. In addition, Cloudflare was ordered to terminate the accounts of several pirate sites.

Of course, it’s not uncommon for S-1 filings to contain statements that can be interpreted as impending doom, since companies are required to be frank about their business’s prospects. However, with single copyright cases often dealing with millions of dollars worth of alleged infringement, Cloudflare’s appraisal of the risks seems entirely warranted.

Cloudflare’s S-1 filing can be viewed here: [link]

Related Topic: [link]
16
Tech News / A Deep Dive Into AMD’s Rome Epyc Architecture
Last post by Admin -

Quote
In any chip design, the devil – and the angel – is always in the details. AMD has been burned by some architectural choices it made with Opteron processors in the past, where assumptions about how code might exploit the hardware did not pan out as planned. With the second generation of Epyc server chips, the company seems intent on not repeating the mistakes it made with the follow-ons to the venerable Opteron processors, which had excellent designs at first.

Time – and customers – will tell, but this derivative kicker implemented in a much-improved multichip design with more advanced etching processes for the cores seems to be hitting the server market with precisely what it wants when it needs it most. And this bodes well for the future of the Epyc chips as an alternative to current and future Xeon chips from Intel.

We have been itching to get into the architectural details of the new “Rome” Epyc server chips, which we covered at the launch last week with the basic feeds, speeds, slots, watts, and pricing. Now, let’s jump into the architectural details of the Rome processors with Mike Clark, lead architect of the Zen cores and a corporate fellow at AMD as well. In many ways, Rome, with its Zen 2 core and mixed process multichip module design, is the processor that AMD must wish it could have put into the field two years ago. Everything about it is better, and it all starts with the etching of the cores and their associated L1 and L2 caches in the 7 nanometer processes from fab partner Taiwan Semiconductor Manufacturing Corp.

“It’s nice to be in the lead in process technology,” Clark said with a wry laugh, adding that Intel and AMD would be leapfrogging each other in the coming years so this victory would not be permanent even if it is undeniable and strategic right now. “That 7 nanometer process drives significant improvements. Interestingly, it gives us 2X the transistor density, but the frequency actually took a lot of work with TSMC and the tool guys. Typically, when you go into a new technology, the frequency goes down, you lose Vmax, and it takes some time to get that frequency back up. But we were able to work with them to create a really good frequency story for 7 nanometers and hold the power the same. And, of course, if you look at the transistors the other way, you can get half the power at the same performance level.”

Instructions per clock, or IPC, is also a big part of the Rome architecture. Jumping from the “Excavator” cores used in the last Opterons from several years ago to the Zen 1 cores used in the “Naples” Epyc chips, AMD was able to increase IPC by 50 percent on a constant clock basis – a huge jump, and akin to what Arm is promising with its “Ares” Neoverse designs. Arm is actually projecting a 60 percent increase in IPC, but to be fair, neither the Excavator Opterons nor the Cortex-A72 were very strong in IPC to begin with – at least not compared to a Xeon core from Intel. Now, AMD and Arm are catching up, and with the Zen 2 cores used in Rome, AMD is adding another 15 percent more IPC, clock for clock. Intel’s generational IPC improvements have been between 5 percent and 10 percent, or about half that rate on average.
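For a rough sense of what those stacked gains mean, the two generational numbers quoted above can be compounded (a back-of-the-envelope sketch, not an official AMD figure):

```python
# IPC gains quoted in the article, compounded across generations.
zen1_vs_excavator = 1.50  # +50 percent, Excavator -> Zen 1 (Naples)
zen2_vs_zen1 = 1.15       # +15 percent, Zen 1 -> Zen 2 (Rome)

zen2_vs_excavator = zen1_vs_excavator * zen2_vs_zen1
print(f"{zen2_vs_excavator:.3f}x")  # 1.725x, i.e. about 72.5 percent over Excavator
```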

Clark said that when IPC goes up, chip architects often pay for that with higher power consumption, but that with the Zen 2 core design the goal was to keep it power neutral compared to Zen 1 in Naples. As it turns out, the Rome engineers tightened the screws and were able to reduce power consumption of the core by 10 percent, above and beyond what was attained through the shrink in process from 14 nanometers with Naples to the 7 nanometers used for the Zen 2 core complexes in Rome. One of the big ways this was accomplished was by doubling the op cache in the core, which helped save power and also increased performance.

In fact, AMD actually shrank the L1 instruction cache on each Zen 2 core, from 64 KB back to 32 KB, and gave that transistor area back to the op cache and branch prediction units and also used some of it to add a third address generation unit. The associativity of the L1 data and instruction caches (both at 32 KB) was doubled to eight-way, and AMD doubled the floating point data path width and then doubled up the L1 cache bandwidth to keep up with it. (Clark said that an eight-way associative L1 cache at 64 KB was going to eat far too much power, and with 64 cores, that was going to be a big problem.) The L3 cache was doubled up on each chiplet to 16 MB a pop, and with twice as many chiplets on the package that is four times the L3 cache capacity, at 256 MB, as what was in the Naples processor. It is not precisely a doubling of everything, but an attempt to get things into a better balance as the core count and chiplet count doubled up. This includes branch prediction, instruction fetching, and instruction decode units, as you can see here:

“We like features that improve both power and performance,” Clark elaborated. “Being on the right path more often is important because the worst use of power is executing instructions that you are just going to throw away. We are not throwing work away after we figure out dynamically that we were wrong to do it. This definitely burns more power on the front end, but it pays dividends on the back end.”

That brings us to the integer and floating point instruction units in the Zen 2 cores.

On the integer front, the arithmetic logic unit (ALU) count remains the same at four, but the address generation unit (AGU) count in the Zen 2 core is boosted by one to a total of three. The schedulers for the ALUs and AGUs are both improved, and the register file and reorder buffers are both boosted in size, too. And the fairness of the algorithms controlling simultaneous multithreading (SMT) with regards to the ALUs and AGUs has been tweaked as well to deal with imbalances in the Zen 1 design.

Intel, of course, implemented a very elegant 512-bit wide AVX-512 vector unit in the “Knights Landing” Xeon Phi processors four years ago, and brought a variant of it – some would say a less elegant variant, because it is harder to keep fed due to the way it was implemented – to the “Skylake” Xeon SP processors. That unit has been carried forward essentially unchanged in the current “Cascade Lake” Xeon SP chips, except for the ability to cram half-precision instructions through it for machine learning inference workloads.

Clark said that AMD was looking at possibly doing 512-bit vectors in future Epyc chips, but at this point was not convinced that just adding wider vectors was the best way to use up the transistor budget. For one thing, Clark added that there are still a lot of floating point routines that are not parallelizable to 512 bits – and sometimes not even to 256 bits or 128 bits, for that matter – so it is a question of when moving to 512 bits on the vector engines in the Epyc line makes sense. AMD will probably be a fast follower and do something akin to the DLBoost machine learning inference instructions, we reckon. Perhaps that capability is already in the architecture, waiting to be activated at some future date when the software stack is ready for it.

With the Zen 1 core, which had a pair of 128 bit vectors, it took two operations to do an AVX-256 instruction, but Zen 2 can run that AVX-256 instruction in one clock; this obviously takes a lot less power. A double precision multiply took four cycles on Zen 1 and it takes only three cycles on Zen 2, which improves the throughput and power efficiency of the floating point units. (IPC figures cited above are for integer instructions, not floating point ones.)
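Those two changes can be put into numbers (a sketch restating the figures in the paragraph above):

```python
# Zen 1 cracked each 256-bit AVX instruction into two 128-bit micro-ops;
# Zen 2 executes it as a single operation.
avx256_uops_zen1, avx256_uops_zen2 = 2, 1
print(avx256_uops_zen1 // avx256_uops_zen2)  # 2x fewer micro-ops per instruction

# Double-precision multiply latency dropped from 4 cycles to 3.
dp_mul_cycles_zen1, dp_mul_cycles_zen2 = 4, 3
print(f"{dp_mul_cycles_zen1 / dp_mul_cycles_zen2:.2f}x")  # 1.33x improvement
```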

As for the caches feeding the Zen 2 cores, all of the structures supporting the caches are bigger and provide more throughput, driving up that IPC:

Here’s what the Zen 2 CPU complex and cache hierarchy looks like:

That increased L2 cache in each core and L3 caches across the cores are a key to allowing that potential IPC in the Zen 2 core to be actualized, because as Clark correctly puts it: “The best way to reduce the latency to memory is to not go there in the first place.”

Add it all up, and you put eight CPU complexes and the I/O and memory hub – a total of nine chips – onto the package to make a top-end Rome Epyc. Lower bin SKUs have fewer core chiplets on the package and sometimes fewer cores activated on each die present, and this yields the breadth of the Rome Epyc 7002 series chips, as we detailed last week.

This is a teardown of the Naples and Rome MCMs, which obviously are architected in very different ways:

There are some important changes with the second generation Infinity Fabric variant of PCI-Express that is used to link the chiplets in the Naples and Rome sockets, respectively, to each other. The Naples chiplets could do a 16 byte read and a 16 byte write across the Infinity Fabric in one clock – FCLK in the fine print is short for fabric clock – while the Infinity Fabric in the Rome chips can do a 32 byte read and a 16 byte write per fabric clock.

While the Rome chips plug into the same sockets as the Naples chips, the way the elements are lashed together inside of that socket is very different. The memory controllers are moved off the CPU complex chiplets and onto that central hub, which is etched in 14 nanometer processes where it runs better than it would at 7 nanometers because I/O and memory have to push signals off the package and into the motherboard where DRAM and PCI-Express peripherals plug in. There are a total of eight DDR4 memory controllers on this hub chip, the same number in total that were on the Naples complex; both support one DIMM per channel and have two channels per controller, but Rome memory runs slightly faster – 3.2 GHz versus 2.67 GHz – and therefore with all memory slots filled, yields a maximum of 410 GB/sec of peak memory bandwidth per socket. That’s 45 percent higher than the Cascade Lake Xeon SP processor, which has six memory controllers for a total of 282 GB/sec of memory bandwidth running at 2.93 GHz and 21 percent higher than the 340 GB/sec that Naples turns in running that 2.67 GHz DRAM. (Those are ratings for two-socket servers.)
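The bandwidth figures in that paragraph work out to channels × transfer rate × 8 bytes, using the two-socket totals the article quotes (a sketch; the channel counts assumed here are 8 per Rome/Naples socket and 6 per Cascade Lake socket):

```python
def peak_ddr4_bw_gbs(channels: int, mt_per_s: int) -> float:
    """Peak DDR4 bandwidth: channels * megatransfers/sec * 8 bytes per transfer."""
    return channels * mt_per_s * 8 / 1000  # GB/sec

rome = peak_ddr4_bw_gbs(16, 3200)          # two sockets x 8 channels, DDR4-3200
naples = peak_ddr4_bw_gbs(16, 2667)        # two sockets x 8 channels, DDR4-2667
cascade_lake = peak_ddr4_bw_gbs(12, 2933)  # two sockets x 6 channels, DDR4-2933

print(round(rome), round(naples), round(cascade_lake))  # 410 341 282
```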

The real big change with the Rome Epycs, and one that is going to have a beneficial effect on performance for a lot of different workloads, is the way NUMA domains are created in the chips and the fewer NUMA hops – distances in the chart below – that are required to move from one part of the processor complex to another. Take a look:

This is basically a NUMA server, with that central hub being a chipset to link the chiplets (sockets in this analogy) together into a baby shared memory system using non-uniform memory access techniques to lash the caches and main memories together.

With the Naples chips, there were three different distances from any one die – which is where the memory was hanging – to another: one hop to the two adjacent dies, two hops to the die diagonally across, and three hops to the dies in the second socket of a two-socket setup. Now there are only two different distances: one hop from a chiplet through the central hub to the memory attached to its own processor, and one more hop across the Infinity Fabric to the second central hub and the memory that hangs off of it. To simplify matters further, there are only two NUMA domains – one for each Rome complex. This should make both Windows Server and Linux run better on single-socket and two-socket systems; Clark said Windows Server had a bit more trouble than Linux with the way NUMA was implemented on Naples. The upshot of these changes is that performance should be better and more even, and across a broader array of workloads to boot.

That I/O and memory controller hub chip also implements the PCI-Express 4.0 lanes that are used to lash peripherals to the system and in the case of two-socket servers, lash a pair of Rome compute complexes to each other.

As with the Naples chips, each Rome chip has 128 lanes of PCI-Express that is configurable in many different ways, as illustrated below:

As with Naples, in a two-socket system half of the total PCI lanes are used to implement the NUMA links between the sockets, so both single-socket and dual-socket Rome machines have 128 PCI-Express lanes to serve peripherals. The ones in Rome have twice the bandwidth and can actually drive 100 Gb/sec and 200 Gb/sec adapters, which PCI-Express 3.0 has trouble doing with the former and cannot do with the latter in a normal x8 slot. The lanes can be used singly, or are often ganged up in pairs (x2) for storage devices, potentially leaving room for maybe 56 NVM-Express drives plus a high speed network interface card in a Rome system.
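The 56-drive figure follows from the lane budget (a sketch; the x16 slot for the network card is my assumption, not stated in the article):

```python
total_lanes = 128    # PCI-Express 4.0 lanes available to peripherals per system
nic_lanes = 16       # assumed x16 slot for the high speed network interface card
lanes_per_drive = 2  # NVM-Express drives ganged as x2, per the article

drives = (total_lanes - nic_lanes) // lanes_per_drive
print(drives)  # 56
```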

Technically, the Naples chip had a single x1 lane separate from all of this for Infinity Fabric control. This x1 lane is also available for other traffic now that there is a central hub. So that means a single-socket Rome server technically has 129 lanes of PCI-Express 4.0 and a two-socket Rome server has 130 lanes. The Intel Xeons only scale down to x4 lanes; they can’t do x2 or x1 lanes, according to Clark. We had not heard this before.

Finally, the Zen 2 cores have some architectural extensions, which are outlined here and which are not being backcast into the Zen 1 cores of the Naples chips:

Next up, we will be taking a look at how AMD stacks up the Rome Epycs against its Xeon rivals and at Intel’s initial and then long-term response to the Rome chips.
18
World News / Re: Epstein's Island, Little St. James USVI Drone July 2019
Last post by FieryBlake -

Search "Jeffrey Epstein" in the news in case you are living under a rock on who this is...

This is an evil dude, but what's more interesting is who he is gonna take down with him. He is connected to hundreds of rich and famous elites.
He's dead now.
20
Bug Report / Re: Bug or Feature?
Last post by Alex Mars -
Okay... Have to adapt then :) Thanks for the fast answer!

On a related note: I'm finally fed up with Firefox. Is it enough to export/import the
Code:
weboasis-settings.json
to keep my customization in the new browser? Any cookies I need to copy too?
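Not an answer to the cookie question, but before importing it may be worth sanity-checking that the exported file is intact JSON. A hedged sketch (the filename comes from the post above; the file's internal structure is an assumption, so adjust the check to taste):

```python
import json
from pathlib import Path

def check_settings_export(path: str) -> dict:
    """Load an exported settings file and make sure it parses as JSON
    before importing it into the new browser.

    Raises json.JSONDecodeError on a corrupt export, ValueError if the
    top level isn't a JSON object (assumed structure).
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"{path}: expected a JSON object at the top level")
    return data

# Example: settings = check_settings_export("weboasis-settings.json")
```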

Cheers, Alex