Messages from 161150

Article: 161150
Subject: Re: Altera Cyclone replacement
From: already5chosen@yahoo.com
Date: Wed, 6 Feb 2019 03:22:43 -0800 (PST)

On Friday, January 25, 2019 at 5:16:04 PM UTC+2, Stef wrote:
> Hi,
> 
> We got an old design with an Altera Cyclone FPGA (EP1C12F324).
> These are probably obsolete (Can't find any info on them on the Intel
> site, Farnell is out of stock, etc.). Currently active are the Cyclone-IV
> and Cyclone-V if I understood correctly.
> 
> Is a design from a Cyclone portable to a Cyclone-IV/V? What kind of
> changes should I expect to code and board? Design includes NIOS.
> 
> Or alternatively, are their sources for these old Cyclone chips?
> (We actually would need 3 different types :-( )
> 
> 
> -- 
> Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)
> 
> There is never time to do it right, but always time to do it over.

As you probably found out yourself, the least painful and most cost-effective migration path is to Cyclone 10LP. Despite the 10 in its name, it is a relatively old family (60 nm) that is less likely than newer chips to have problems with 3/3.3V external I/O.
MAX 10 is very cheap at 2 KLUTs. If your design is bigger than that, then Cy10LP would be cheaper.

For relatively big volumes consider Lattice Mach. Their list price is no good, but the volume discounts are fantastic. But be ready for a much higher level of pain during development than what you are probably accustomed to with Cyclone.

Article: 161151
Subject: Re: Altera Cyclone replacement
From: already5chosen@yahoo.com
Date: Wed, 6 Feb 2019 03:54:23 -0800 (PST)

On Thursday, January 31, 2019 at 2:42:54 AM UTC+2, jim.bra...@ieee.org wrote:
> On Wednesday, January 30, 2019 at 11:14:26 AM UTC-6, gnuarm.del...@gmail.com wrote:
> > On Wednesday, January 30, 2019 at 11:24:17 AM UTC-5, kkoorndyk wrote:
> > > On Tuesday, January 29, 2019 at 7:57:05 PM UTC-5, gnuarm.del...@gmail.com wrote:
> > > > On Monday, January 28, 2019 at 10:49:32 AM UTC-5, kkoorndyk wrote:
> > > > You may as well take the opportunity to "future proof" the design by migrating to another vendor that isn't likely to get acquired or axed.  Xilinx has the single core Zynq-7000 devices if you want to go with a more main-stream, ARM processor sub-system (although likely overkill for whatever your Nios is doing).  Otherwise, the Artix-7 and Spartan-7 would be good targets if you want to migrate to a Microblaze or some other soft core.  The Spartan-7 family is essentially the Artix-7 fabric with the transceivers removed and are offered in 6K to 100K logic cell densities.
> > > >
> > > > I don't think you actually got my point.  Moving to a Spartan by using a MicroBlaze processor isn't "future proofing" anything.  It is just shifting from one brand to another with the exact same problems.
> > > >
> > > > If you want to future proof a soft CPU design you need to drop any FPGA company in-house processor and use an open source processor design.  Then you can use any FPGA you wish.
> > > >
> > > > Here is some info on the J1, an open source processor that was used to replace a microblaze when it became unequal to the task at hand.
> > > >
> > > > http://www.forth.org/svfig/kk/11-2010-Bowman.pdf
> > > >
> > > > http://www.excamera.com/sphinx/fpga-j1.html
> > > >
> > > > http://www.excamera.com/files/j1.pdf
> > > >
> > > >
> > > >   Rick C.
> > > >
> > > >   -- Get 6 months of free supercharging
> > > >   -- Tesla referral code - https://ts.la/richard11209
> > >
> > > No, I got your point perfectly, hence the following part of my recommendation:  "or some other soft core."
> >
> > I am making the point that porting from one proprietary processor to another is of limited value.  Microblaze is proprietary.  I believe there may be some open source versions available, but I expect there are open source versions of the NIOS available as well.  But perhaps more importantly, they are far from optimal.  That's why I posted the info on the J1 processor.  It was invented to replace a Microblaze that wasn't up to the task.
> >
> >
> > > If the original Nios was employed, I'm not entirely convinced a soft core is necessary (yet).  How simple is the software running on it?  Can it reasonably be ported to HDL, thus ensuring portability?  I tend to lean that way unless the SW was simple due to capability limitations in the earlier technologies (e.g., old Cyclone and Nios) and the desire is to add more features that are realizable with new generation devices and soft (or hard) core capabilities.
> >
> > Sometimes soft CPUs are added to reduce the size of logic.  Other times they are added because of the complexity of expression.  Regardless of how simply we can write HDL, the large part of the engineering world perceives HDL as much more complex than other languages and are not willing to port code to an HDL unless absolutely required.  So if the code is currently in C, it won't get ported to HDL without a compelling reason.
> >
> > Personally I think Xilinx and Altera are responsible for the present perception that FPGAs are difficult to use, expensive, large and power hungry.  That is largely true if you use their products only.  Lattice has been addressing a newer market with small, low power, inexpensive devices intended for the mobile market.  Now if someone would approach the issue of ease of use by something more than throwing an IDE on top of their command line tools, the FPGA market can explode into territory presently dominated by MCUs.
> >
> > Does anyone really think toasters can only be controlled by MCUs?  We just need a cheap enough FPGA in a suitable package.
> >
> >
> >   Rick C.
> >
> >   +- Get 6 months of free supercharging
> >   +- Tesla referral code - https://ts.la/richard11209
>
> ]>Microblaze is proprietary.  I believe there may be some open source versions available, but I expect there are open source versions of the NIOS available as well.
>
> Microblaze clones: aeMB, an-noc-mpsoc, mblite, mb-lite-plus, myblaze, openfire_core, openfire2, secretblaze
>
> No NIOS clones that I know of
>

I am playing with one right now.
I already have half a dozen working variants, each with its own advantages and disadvantages in terms of resource usage (LEs vs M9K) and Fmax. The smallest one is still not as small as Altera's Nios2e and the fastest one is still not as fast as Altera's Nios2f. Beating Nios2e on size is in my [near] future plans; beating Altera's Nios2f on speed and features is of lesser priority.

My cores are less full-featured than even Nios2e. They are intended for one particular niche that I would call "soft MCU". In particular, the only supported program memory is what Altera calls "tightly coupled memory", i.e. embedded dual-ported SRAM blocks with no other master connected. Other limitations are the absence of exceptions and external interrupts. For me that's o.k., since that's how I use Nios2e anyway.

I didn't check if what I am doing is legal.
Probably does not matter as long as it's just a repo on github.


> ]>But perhaps more importantly, they are far from optimal.
> Ugh, they have some of the best figure-of-merit numbers available.
>   (Instructions per second per LUT)
> And are available in many configuration options.
>
> There are a large variety of RISC-V cores available some of which have low LUT counts.

The fixed-instruction-width 32-bit subset of the RISC-V ISA is nearly identical to Nios2, down to the level of instruction formats. The biggest difference is the 12-bit immediate in RV vs the 16-bit immediate in N2. Not a big deal.
So I expect that RV32 cores available in source form can be modified to run Nios2 in a few days (or, if the original designer is involved, in a few hours).
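
To make the immediate-width difference concrete, here is a minimal Python sketch that decodes an I-type instruction word in both ISAs. The field positions are my reading of the published instruction formats (RV32I I-type and Nios II I-type), and the function names are made up for illustration; treat it as a sketch, not a verified decoder.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def decode_rv32_itype(word):
    """RV32 I-type layout (assumed): imm[31:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": sign_extend(word >> 20, 12),  # 12-bit immediate
    }


def decode_nios2_itype(word):
    """Nios II I-type layout (assumed): A[31:27] B[26:22] IMM16[21:6] OP[5:0]."""
    return {
        "op": word & 0x3F,
        "imm": sign_extend((word >> 6) & 0xFFFF, 16),  # 16-bit immediate
        "b": (word >> 22) & 0x1F,
        "a": (word >> 27) & 0x1F,
    }


if __name__ == "__main__":
    # addi x1, x2, -1 in RV32I and addi sp, sp, -4 in Nios II
    print(decode_rv32_itype(0xFFF10093))
    print(decode_nios2_itype(0xDEFFFF04))
```

Both decoders pull two 5-bit register fields, an opcode and one sign-extended immediate out of a 32-bit word; only the field positions and the immediate width differ.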

The bigger difference would be the external interface. In N2 one expects Avalon-MM. I have no idea what the standard bus/fabric is in the world of RV soft cores and how similar it is to AVM.


>
> Jim Brakefield

Article: 161152
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: Thomas Heller <theller@ctypes.org>
Date: Thu, 7 Feb 2019 10:25:37 +0100

On 04.02.2019 at 10:20, Swapnil Patil wrote:
> On Monday, February 4, 2019 at 11:59:45 AM UTC+5:30, Swapnil Patil
> wrote:
>> Hello folks,
>> 
>> Let's say I have Spartan 6 board only and i wanted to implement
>> Ethernet communication.So how can it be done?
>> 
>> I don't want to connect any Hard or Soft core processor. also I
>> have looked into WIZnet W5300 Ethernet controller interfacing to
>> spartan 6, but I don't want to connect any such controller just
>> spartan 6. So how can it be done?
>> 
>> It is not necessary to use spartan 6 board only.If it possible to
>> workout with any another boards I would really like to know.
>> Thanks
> 
> 
> Thanks for replies. I understand it's not easy to implement still i
> want to give a try. If you have any links or document of work done
> related to this please share. Rick C. could you tell more how one
> should start to implement this with cores? I also wanted to know more
> about these written cores. Hans is it possible we can get information
> about work that companies made you know about? Thanks.
> 

You might want to read this:

https://www.fpga4fun.com/10BASE-T.html

Thomas

Article: 161153
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: already5chosen@yahoo.com
Date: Thu, 7 Feb 2019 02:07:20 -0800 (PST)

On Tuesday, February 5, 2019 at 12:12:47 PM UTC+2, David Brown wrote:
> On 04/02/2019 21:55, gnuarm.deletethisbit@gmail.com wrote:
>
> > I don't know a lot about TCP/IP, but I've been told you can implement it to many different degrees depending on your requirements.  I think it had to do with the fact that some aspects are specified rather vaguely, timeouts and who manages the retries, etc.  I assume this was not as full an implementation as you might have on a PC.  So I wonder if this is an apples to oranges comparison.
> >
>
> That is correct - there are lots of things in IP networking in general,
> and TCP/IP on top of that, which can be simplified, limited, or handled
> statically.  For example, TCP/IP has window size control so that each
> end can automatically adjust if there is a part of the network that has
> a small MTU (packet size) - that way there will be less fragmentation,
> and greater throughput.  That is an issue if you have dial-up modems and
> similar links - if you have a more modern network, you could simply
> assume a larger window size and leave it fixed.  There are a good many
> such parts of the stack that can be simplified.
>
>
>
> > Are there any companies selling TCP/IP that they actually list on their web site?
> >

TCP window size and MTU are orthogonal concepts.
Judging by this post, I'd suspect that you know more about TCP than Rick C, but less than Rick H, who sounds like the only one of the three of you who got his own hands dirty in an attempt to implement it.

Article: 161154
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: already5chosen@yahoo.com
Date: Thu, 7 Feb 2019 02:23:44 -0800 (PST)

On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:
>
> Back in the late 80s there was the perception that TCP was
> slow, and hence new transport protocols were developed to
> mitigate that, e.g. XTP.
>
> In reality, it wasn't TCP per se that was slow. Rather
> the implementation, particularly multiple copies of data
> as the packet went up the stack, and between network
> processor / main processor and between kernel and user
> space.

TCP per se *is* slow when the frame error rate of the underlying layers is not near zero.
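
One common way to put rough numbers on this is the well-known Mathis et al. approximation for steady-state TCP throughput under random loss, throughput <= (MSS / RTT) * C / sqrt(p) with C of about sqrt(3/2). The sketch below just evaluates that formula; treating the frame error rate as the packet loss rate p is my simplifying assumption, and the model itself is only an approximation for Reno-style congestion control.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate upper bound on TCP throughput in bits/s (Mathis et al. model).

    mss_bytes: maximum segment size in bytes
    rtt_s:     round-trip time in seconds
    loss_rate: packet loss probability, assumed here to equal the frame error rate
    """
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


if __name__ == "__main__":
    # Hypothetical link: 1460-byte MSS, 10 ms RTT, varying loss rates
    for p in (1e-6, 1e-4, 1e-2):
        print(f"loss={p:g}: {mathis_throughput_bps(1460, 0.010, p) / 1e6:.1f} Mbit/s")
```

The point of the formula is the 1/sqrt(p) term: a hundredfold increase in loss rate costs a factor of ten in achievable throughput, regardless of the raw link speed.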

Also, there exist cases of "interesting" interactions between the Nagle algorithm at the transmitter and the ACK-saving algorithm at the receiver that can lead to slowness in certain styles of TCP conversation (send a mid-size block of data, wait for an application-level acknowledgement, send the next mid-size block). That is typically resolved by not following the language of the RFCs too literally.
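
On a host with a conventional sockets API, one widely used application-level workaround for this send/wait/send pattern is to switch off the Nagle algorithm on the sending socket. That is not necessarily what the implementations mentioned above do internally; the snippet is only a minimal sketch of that workaround using the standard `TCP_NODELAY` option, and `make_low_latency_socket` is a name made up for the example.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY set)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # With TCP_NODELAY set, small writes are sent immediately instead of being
    # held back until the previous segment has been acknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_low_latency_socket()
    print("TCP_NODELAY:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
```

The trade-off is more small packets on the wire, so this suits request/response traffic better than bulk transfers.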

Article: 161155
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: David Brown <david.brown@hesbynett.no>
Date: Thu, 7 Feb 2019 13:19:53 +0100

On 07/02/2019 11:07, already5chosen@yahoo.com wrote:
> On Tuesday, February 5, 2019 at 12:12:47 PM UTC+2, David Brown wrote:
>> On 04/02/2019 21:55, gnuarm.deletethisbit@gmail.com wrote:
>>
>>> I don't know a lot about TCP/IP, but I've been told you can implement it to many different degrees depending on your requirements.  I think it had to do with the fact that some aspects are specified rather vaguely, timeouts and who manages the retries, etc.  I assume this was not as full an implementation as you might have on a PC.  So I wonder if this is an apples to oranges comparison.  
>>>
>>
>> That is correct - there are lots of things in IP networking in general,
>> and TCP/IP on top of that, which can be simplified, limited, or handled
>> statically.  For example, TCP/IP has window size control so that each
>> end can automatically adjust if there is a part of the network that has
>> a small MTU (packet size) - that way there will be less fragmentation,
>> and greater throughput.  That is an issue if you have dial-up modems and
>> similar links - if you have a more modern network, you could simply
>> assume a larger window size and leave it fixed.  There are a good many
>> such parts of the stack that can be simplified.
>>
>>
>>
>>> Are there any companies selling TCP/IP that they actually list on their web site? 
>>>
> 
> TCP window size and MTU are orthogonal concepts.
> Judging by this post, I'd suspect that you know more about TCP than Rick C, but less than Rick H, who sounds like the only one of the three of you who got his own hands dirty in an attempt to implement it.
> 

They are different concepts, yes.  Window size can be reduced to below
MTU size on small systems to ensure that you don't get fragmentation,
and that you never need to resend more than one low-level packet.  But
it is not a level of detail that I have needed to work at, so I have no
personal experience of that.
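
As a rough illustration of how the two quantities relate, here is a small Python sketch.  It assumes IPv4 and TCP headers of 20 bytes each with no options, and the function names are made up for the example.  The MTU bounds the size of one segment; the window bounds how much unacknowledged data may be in flight.

```python
IPV4_HEADER_BYTES = 20  # assumption: no IP options
TCP_HEADER_BYTES = 20   # assumption: no TCP options


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def full_segments_in_flight(window_bytes, mss):
    """Number of full-size segments the window allows to be unacknowledged."""
    return window_bytes // mss


def window_limited_throughput_bps(window_bytes, rtt_s):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    mss = mss_for_mtu(1500)  # standard Ethernet MTU
    print("MSS:", mss)
    print("segments in flight, window = 1 MSS:", full_segments_in_flight(mss, mss))
    print("segments in flight, window = 65535:", full_segments_in_flight(65535, mss))
```

With the window held at one MSS, only a single segment is ever outstanding, which keeps retransmission simple at the cost of one segment per round trip.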

Article: 161156
Subject: Re: Altera Cyclone replacement
From: gnuarm.deletethisbit@gmail.com
Date: Thu, 7 Feb 2019 10:36:31 -0800 (PST)

On Wednesday, February 6, 2019 at 6:54:27 AM UTC-5, already...@yahoo.com wrote:
> I am playing with one right now.
> Already have half-dozen working variants each with its own advantage/disadvantage in terms of resources usage (LEs vs M9K) and Fmax. The smallest one is still not as small as Altera's Nios2e and the fastest one is still not as fast as Altera's Nios2f. Beating Nios2e on size is in my [near] future plans, beating Altera's Nios2f on speed and features is of lesser priority.
>
> My cores are less full-featured than even nios2e. They are intended for one certain niche that I would call "soft MCU". In particular, the only supported program memory is what Altera calls "tightly coupled memory", i.e. embedded dual-ported SRAM blocks with no other master connected. Another limitations are absence of exceptions and external interrupts. For me it's o.k. that's how I use nios2e anyway.
>
> I didn't check if what I am doing is legal.
> Probably does not matter as long as it's just a repo on github.
>
>
> > ]>But perhaps more importantly, they are far from optimal.
> > Ugh, they have some of the best figure-of-merit numbers available.
> >   (Instructions per second per LUT)
> > And are available in many configuration options.
> >
> > There are a large variety of RISC-V cores available some of which have low LUT counts.
>
> Fixed-instruction-width 32-bit subset of RISC-V ISA is nearly identical to Nios2 down to the level of instruction formats. The biggest difference is 12-bit immediate in RV vs 16-bit in N2. Not a big deal.
> So I expect that RV32 cores available in source form can be modified to run Nios2 in few days (or, if original designer is involved, in few hours).
>
> The bigger difference would be external interface. In N2 one expects Avalon-mm. I have no idea what's a standard bus/fabric in the world of RV soft cores and how similar it is to AVM.

Should I assume you are not using C to program these CPUs?

If that is correct, have you considered a stack based CPU?  When you refer to CPUs like the RISC-V I'm thinking they use thousands of LUT4s.  Many stack based CPUs can be implemented in 1k LUT4s or less.  They can run fast, >100 MHz and typically are not pipelined.

There is a lot of interest in stack CPUs in the Forth community since typically their assembly language is similar to the Forth virtual machine.

I'm not familiar with Avalon and I don't know what N2 is.  A popular bus in the FPGA embedded world is Wishbone.



  Rick C.

  --+ Tesla referral code - https://ts.la/richard11209

Article: 161157
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: Tom Gardner <spamjunk@blueyonder.co.uk>
Date: Thu, 7 Feb 2019 20:04:04 +0000

On 07/02/19 10:23, already5chosen@yahoo.com wrote:
> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:
>> 
>> Back in the late 80s there was the perception that TCP was slow, and hence
>> new transport protocols were developed to mitigate that, e.g. XTP.
>> 
>> In reality, it wasn't TCP per se that was slow. Rather the implementation,
>> particularly multiple copies of data as the packet went up the stack, and
>> between network processor / main processor and between kernel and user 
>> space.
> 
> TCP per se *is* slow when frame error rate of underlying layers is not near
> zero.

That's a problem with any transport protocol.

The solution to underlying frame errors is FEC, but that
reduces the bandwidth when there are no errors. Choose
what you optimise for!


> Also, there exist cases of "interesting" interactions between Nagle algorithm
> at transmitter and ACK saving algorithm at receiver that can lead to slowness
> of certain styles of TCP conversions (Send mid-size block of data, wait for
> application-level acknowledge, send next mid-size block) that is typically
> resolved by not following the language of RFCs too literally.

That sounds like a "corner case". I'd be surprised
if you couldn't find corner cases in all transport
protocols.

Article: 161158
Subject: Re: Altera Cyclone replacement
From: already5chosen@yahoo.com
Date: Thu, 7 Feb 2019 13:00:52 -0800 (PST)

On Thursday, February 7, 2019 at 8:36:36 PM UTC+2, gnuarm.del...@gmail.com wrote:
> Should I assume you are not using C to program these CPUs?
>

That would be a wrong assumption.
The exact opposite is far closer to reality: I pretty much never use anything but C to program these CPUs.

> If that is correct, have you considered a stack based CPU?  When you refer to CPUs like the RISC-V I'm thinking they use thousands of LUT4s.

It depends on the performance one is looking for.
2-2.5 KLUT4s (plus a few embedded memory blocks and multipliers) is the size of a fully pipelined single-issue CPU with direct-mapped instruction and data caches, multiplier and divider that runs at a very decent Fmax but features no MMU or MPU.
On the other end of the spectrum you find the winners of the RISC-V core size competition: under 400 LUTs, but (I would guess, I didn't check) glacially slow in terms of CPI. Fmax is still decent, though.

My half-dozen Nios2 cores are in the middle: 700 to 850 LUT4s, CPI ranging from (approximately) 2.1 to 4.7 and Fmax ranging from reasonable to impractically high.
But my main goal was (is) a learning experience rather than practicality. In particular, for the majority of variants I set myself the impractical constraint of implementing the register file in a single embedded memory block. Doing it in two blocks is far more practical, but less challenging. The same goes for aiming at very high Fmax: not practical, but fun.
Maybe, after I explore another half-dozen or dozen fun possibilities, I will settle on building the most practical solutions. But it is no less probable that I'll lose interest and/or focus before that. I am not too passionate about the whole thing.


>  Many stack based CPUs can be implemented in 1k LUT4s or less.  They can run fast, >100 MHz and typically are not pipelined.
>
> There is a lot of interest in stack CPUs in the Forth community since typically their assembly language is similar to the Forth virtual machine.
>
> I'm not familiar with Avalon and I don't know what N2 is.

N2 is my shortcut for Nios2.

> A popular bus in the FPGA embedded world is Wishbone.
>

I have noticed that Wishbone is popular in Lattice circles. But the Altera world is many times bigger than the Lattice one, and here Avalon is king. Also, when performance matters, Avalon is much better technically.


>
>
>   Rick C.
>
>   --+ Tesla referral code - https://ts.la/richard11209

Article: 161159
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: already5chosen@yahoo.com
Date: Thu, 7 Feb 2019 13:17:35 -0800 (PST)
Links: << >> << T >> << A >>

On Thursday, February 7, 2019 at 10:04:09 PM UTC+2, Tom Gardner wrote:
> On 07/02/19 10:23, already5chosen@yahoo.com wrote:
> > On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:
> >> 
> >> Back in the late 80s there was the perception that TCP was slow, and hence
> >> new transport protocols were developed to mitigate that, e.g. XTP.
> >> 
> >> In reality, it wasn't TCP per se that was slow. Rather the implementation,
> >> particularly multiple copies of data as the packet went up the stack, and
> >> between network processor / main processor and between kernel and user 
> >> space.
> > 
> > TCP per se *is* slow when frame error rate of underlying layers is not near
> > zero.
> 
> That's a problem with any transport protocol.
> 

TCP is worse than most.
Partly because it's a jack of all trades in terms of latency and bandwidth.
Partly because it's stream-oriented (rather than datagram-oriented), which makes recovery based on selective retransmission far more complicated and therefore less practical.

> The solution to underlying frame errors is FEC, but that
> reduces the bandwidth when there are no errors. Choose
> what you optimise for!
> 
> 
> > Also, there exist cases of "interesting" interactions between Nagle algorithm
> > at transmitter and ACK saving algorithm at receiver that can lead to slowness
> > of certain styles of TCP conversions (Send mid-size block of data, wait for
> > application-level acknowledge, send next mid-size block) that is typically
> > resolved by not following the language of RFCs too literally.
> 
> That sounds like a "corner case". I'd be surprised
> if you couldn't find corner cases in all transport
> protocols.

Sure. But it's not a rare corner case. And again, it is far less likely to happen with datagram-oriented reliable transports.
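
For the Nagle / delayed-ACK interaction quoted above, the workaround most applications reach for is to disable Nagle's algorithm on the sending socket. Note that this is a different fix from the one named in the post (relaxing the RFC wording inside the stack). The sketch below assumes only the standard BSD-style socket API as exposed by Python; the helper name is made up for illustration.

```python
import socket

def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY asks the stack to send small segments immediately instead of
    # holding them back until earlier data has been acknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

This trades some extra small packets on the wire for lower latency in the "send block, wait for application-level acknowledge" pattern.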

Article: 161160
Subject: Re: Altera Cyclone replacement
From: gnuarm.deletethisbit@gmail.com
Date: Thu, 7 Feb 2019 13:43:35 -0800 (PST)
Links: << >> << T >> << A >>

On Thursday, February 7, 2019 at 4:00:57 PM UTC-5, already...@yahoo.com wrote:
> On Thursday, February 7, 2019 at 8:36:36 PM UTC+2, gnuarm.del...@gmail.com wrote:
> >
> > Should I assume you are not using C to program these CPUs?
> >
>
> That would be a wrong assumption.
> An exact opposite is far closer to reality - I pretty much never use anything, but C to program these CPUs.
>
> > If that is correct, have you considered a stack based CPU?  When you refer to CPUs like the RISC-V I'm thinking they use thousands of LUT4s.
>
> It depends on performance, one is looking for.
> 2-2.5 KLUT4s (+few embedded memory blocks and multipliers) is a size of fully pipelined single-issue CPU with direct-mapped instruction and data caches, multiplier and divider that runs at very decent Fmax, but features no MMUs or MPU.
> On the other end of the spectrum you find winners of RISC-V core size competition - under 400 LUTs, but (I would guess, didn't check it), glacially slow in terms of CPI. But Fmax is still decent.
>
> Half-dozen Nios2 cores of mine is in the middle - 700 to 850 LUT4s, CPI ranging from (approximately) 2.1 to 4.7 and Fmax ranging from reasonable to impractically high.
> But my main goal was (is) a learning experience rather than practicality. In particular, for majority of variants I set to myself impractical constrain of implementing register file in a single embedded memory block. Doing it in two blocks is far more practical, but less challenging. The same goes with aiming to very high Fmax- not practical, but fun.
> May be, after I explore another half-dozen of dozen of fun possibilities I will settle on building the most practical solutions. But not less probable that I'll lose interest and/or focus before that. I am not too passionate about the whole thing.

Ok, if you are doing C in FPGA CPUs then you are in a different world than the stuff I've worked on.  My projects use a CPU as a controller and often have very critical real-time requirements.  While C doesn't prevent that, I prefer to just code in assembly language and, more importantly, use a CPU design that provides single-cycle execution of all instructions.  That's why I like stack processors: they are easy to design, use a very simple instruction set, and the assembly language can be very close to the Forth high-level language.


> >  Many stack based CPUs can be implemented in 1k LUT4s or less.  They can run fast, >100 MHz and typically are not pipelined.
> >
> > There is a lot of interest in stack CPUs in the Forth community since typically their assembly language is similar to the Forth virtual machine.
> >
> > I'm not familiar with Avalon and I don't know what N2 is.
>
> N2 is my shortcut for Nios2.
>
> > A popular bus in the FPGA embedded world is Wishbone.
> >
>
> I payed attention that Wishbone is popular in Lattice cycles. But Altera world is many times bigger than Lattice and here Avalon is a king. Also,  when performance matters, Aavalon is much better technically.

I'm not familiar with which bus is preferred where.  I just know that every project I've looked at on OpenCores that used a standard bus used Wishbone.  If you say Avalon is better, ok.  Is it open source?  Can it be used on products other than Intel's?


  Rick C.

  -+- Tesla referral code - https://ts.la/richard11209

Article: 161161
Subject: Re: Altera Cyclone replacement
From: already5chosen@yahoo.com
Date: Thu, 7 Feb 2019 14:11:15 -0800 (PST)
Links: << >> << T >> << A >>

On Thursday, February 7, 2019 at 11:43:39 PM UTC+2, gnuarm.del...@gmail.com wrote:
> On Thursday, February 7, 2019 at 4:00:57 PM UTC-5, already...@yahoo.com wrote:
> > On Thursday, February 7, 2019 at 8:36:36 PM UTC+2, gnuarm.del...@gmail.com wrote:

> > 
> > > A popular bus in the FPGA embedded world is Wishbone.  
> > > 
> > 
> > I payed attention that Wishbone is popular in Lattice cycles. But Altera world is many times bigger than Lattice and here Avalon is a king. Also,  when performance matters, Aavalon is much better technically.
> 
> I'm not familiar with what bus is preferred where.  I just know that every project I've looked at on OpenCores using a standard bus used Wishbone.  If you say Avalon is better, ok.  Is it open source?  Can it be used on other than Intel products? 
> 

I am not sure what "open source" means in this context.
Avalon-MM and Avalon-ST are specifications, i.e. documents. The documents are freely downloadable from the Altera/Intel web site.

https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/manual/mnl_avalon_spec.pdf

The GUI tool that connects together components conforming to the Avalon specs, which was called SOPC Builder in the 00s, then Qsys, and is now Intel Platform Designer, is a proprietary closed-source program.
The code that the tool generates is normal VHDL or, more often, normal Verilog, and it contains a copyright statement like this:
// -----------------------------------------------------------
// Legal Notice: (C)2007 Altera Corporation. All rights reserved.  Your
// use of Altera Corporation's design tools, logic functions and other
// software and tools, and its AMPP partner logic functions, and any
// output files any of the foregoing (including device programming or
// simulation files), and any associated documentation or information are
// expressly subject to the terms and conditions of the Altera Program
// License Subscription Agreement or other applicable license agreement,
// including, without limitation, that your use is for the sole purpose
// of programming logic devices manufactured by Altera and sold by Altera
// or its authorized distributors.  Please refer to the applicable
// agreement for further details.

So, you can't legally use Qsys-generated code with non-Intel devices.
But (IANAL) nobody prevents you from writing your own interconnect generation tool, or from not using any CAD tool at all and just connecting components manually within your HDL. Isn't that mostly what you do with Wishbone components anyway?

> 
>   Rick C.
> 
>   -+- Tesla referral code - https://ts.la/richard11209

Article: 161162
Subject: Re: Altera Cyclone replacement
From: gnuarm.deletethisbit@gmail.com
Date: Thu, 7 Feb 2019 16:22:56 -0800 (PST)
Links: << >> << T >> << A >>

On Thursday, February 7, 2019 at 5:11:20 PM UTC-5, already...@yahoo.com wrote:
> On Thursday, February 7, 2019 at 11:43:39 PM UTC+2, gnuarm.del...@gmail.com wrote:
> > On Thursday, February 7, 2019 at 4:00:57 PM UTC-5, already...@yahoo.com wrote:
> > > On Thursday, February 7, 2019 at 8:36:36 PM UTC+2, gnuarm.del...@gmail.com wrote:
>
> > >
> > > > A popular bus in the FPGA embedded world is Wishbone.
> > > >
> > >
> > > I payed attention that Wishbone is popular in Lattice cycles. But Altera world is many times bigger than Lattice and here Avalon is a king. Also,  when performance matters, Aavalon is much better technically.
> >
> > I'm not familiar with what bus is preferred where.  I just know that every project I've looked at on OpenCores using a standard bus used Wishbone.  If you say Avalon is better, ok.  Is it open source?  Can it be used on other than Intel products?
> >
>
> i am not sure what 'open source" means in this context.
> Avalon-MM and Avalon-ST are specifications. the documents. The documents freely downloadable from Altera/Intel web site.
>
> https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/manual/mnl_avalon_spec.pdf
>
> A GUI tool, that connects together components, conforming to Avalon specs, which was called SOPC is 00s, then QSYS and now Intel system designer, or something like that, is proprietary close source program.
> The code that a tool generates is a normal VHDL or, more often, normal Verilog, that contains copyright statement like that:
> // -----------------------------------------------------------
> // Legal Notice: (C)2007 Altera Corporation. All rights reserved.  Your
> // use of Altera Corporation's design tools, logic functions and other
> // software and tools, and its AMPP partner logic functions, and any
> // output files any of the foregoing (including device programming or
> // simulation files), and any associated documentation or information are
> // expressly subject to the terms and conditions of the Altera Program
> // License Subscription Agreement or other applicable license agreement,
> // including, without limitation, that your use is for the sole purpose
> // of programming logic devices manufactured by Altera and sold by Altera
> // or its authorized distributors.  Please refer to the applicable
> // agreement for further details.
>
> So, you can't legally use QSYS-generated code with non-Intel devices.
> But (IANAL) nobody prevents you from writing your own interconnect generation tool. Or from not using any CAD tool at all and just connecting components manually within your HDL. Isn't it mostly what you do with Wishbone components, anyway?

Sorry, I don't know what you are referring to.  But my concern with the bus is that it is entirely possible, and not at all uncommon, for such a design to have aspects which are under license.  Some time ago it was ruled that the Z80 did not infringe on Intel's 8080 design, but the mnemonics were copyrighted, so Zilog had to develop their own assembler syntax.  ARM decided to protect their CPU design with a patent on some aspect of interrupt handling, if I recall correctly.  So while there are equivalent CPUs on the market (RISC-V for example), there are no ARM clones even though all the ARM architecture documents are freely available.

The point is I don't know if this Altera bus is protected in some way or not.  That's why I was asking.  IANAL either.

I think the term open source is pretty clear in all contexts.  Lattice has their own CPU designs for use in FPGAs.  The difference is they don't care if you use them in a Xilinx chip.


  Rick C.

  -++ Tesla referral code - https://ts.la/richard11209

Article: 161163
Subject: Re: Altera Cyclone replacement
From: David Brown <david.brown@hesbynett.no>
Date: Fri, 8 Feb 2019 10:54:02 +0100
Links: << >> << T >> << A >>

On 07/02/2019 22:43, gnuarm.deletethisbit@gmail.com wrote:
> On Thursday, February 7, 2019 at 4:00:57 PM UTC-5,
> already...@yahoo.com wrote:
>> On Thursday, February 7, 2019 at 8:36:36 PM UTC+2,
>> gnuarm.del...@gmail.com wrote:

>>> I'm not familiar with Avalon and I don't know what N2 is.
>> 
>> N2 is my shortcut for Nios2.
>> 
>>> A popular bus in the FPGA embedded world is Wishbone.
>>> 
>> 
>> I payed attention that Wishbone is popular in Lattice cycles. But
>> Altera world is many times bigger than Lattice and here Avalon is a
>> king. Also,  when performance matters, Aavalon is much better
>> technically.
> 
> I'm not familiar with what bus is preferred where.  I just know that
> every project I've looked at on OpenCores using a standard bus used
> Wishbone.  If you say Avalon is better, ok.  Is it open source?  Can
> it be used on other than Intel products?
> 

I have no idea of the legal aspects of Avalon (I only ever used it on
Altera devices, long ago).  But technically it is very similar to
Wishbone for many common uses.  Things always get complicated when you
need priorities, bursts, variable wait states, etc., but for simpler and
static connections, I don't remember it as being difficult to mix them.
 (It was many years ago when I did this, however.)

<https://en.wikipedia.org/wiki/Wishbone_(computer_bus)#Comparisons>

Article: 161164
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: Allan Herriman <allanherriman@hotmail.com>
Date: Fri, 08 Feb 2019 04:35:26 -0600
Links: << >> << T >> << A >>

On Thu, 07 Feb 2019 20:04:04 +0000, Tom Gardner wrote:

> On 07/02/19 10:23, already5chosen@yahoo.com wrote:
>> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:
>>> 
>>> Back in the late 80s there was the perception that TCP was slow, and
>>> hence new transport protocols were developed to mitigate that, e.g.
>>> XTP.
>>> 
>>> In reality, it wasn't TCP per se that was slow. Rather the
>>> implementation,
>>> particularly multiple copies of data as the packet went up the stack,
>>> and between network processor / main processor and between kernel and
>>> user space.
>> 
>> TCP per se *is* slow when frame error rate of underlying layers is not
>> near zero.
> 
> That's a problem with any transport protocol.
> 
> The solution to underlying frame errors is FEC, but that reduces the
> bandwidth when there are no errors. Choose what you optimise for!

FEC does reduce bandwidth in some sense, but in all of the Ethernet FEC 
implementations I've done, the 64B66B signal is recoded into something 
more efficient to make room for the FEC overhead.  IOW, the raw bit rate 
on the fibre is the same whether FEC is on or off.

Perhaps a more important issue is latency.  In my experience these are 
block codes, and the entire block must be received before it can be 
corrected.  The last one I did added about 240ns when FEC was enabled.
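
The store-and-forward cost of a block code can be estimated from the block length and the line rate alone. The sketch below computes only that lower bound. The 2112-bit block and the 10.3125 Gb/s rate are illustrative assumptions, not the parameters of the design described above, and a real decoder adds processing time on top.

```python
def block_fill_latency_ns(block_bits: int, line_rate_bps: float) -> float:
    """Time to receive one whole FEC block, in nanoseconds.

    A block decoder cannot start correcting until the last bit of the block
    has arrived, so this is a lower bound on the latency added by the FEC.
    """
    return block_bits / line_rate_bps * 1e9

# Illustrative numbers only (assumed, not taken from the post).
latency = block_fill_latency_ns(block_bits=2112, line_rate_bps=10.3125e9)
print(f"{latency:.1f} ns")
```
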

Optics modules (e.g. QSFP) that have sufficient margin to work without 
FEC are sometimes marketed as "low latency" even though they have the 
same latency as the ones that require FEC.

Regards,
Allan

Article: 161165
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: Tom Gardner <spamjunk@blueyonder.co.uk>
Date: Fri, 8 Feb 2019 11:28:09 +0000
Links: << >> << T >> << A >>

On 08/02/19 10:35, Allan Herriman wrote:
> On Thu, 07 Feb 2019 20:04:04 +0000, Tom Gardner wrote:
> 
>> On 07/02/19 10:23, already5chosen@yahoo.com wrote:
>>> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:
>>>>
>>>> Back in the late 80s there was the perception that TCP was slow, and
>>>> hence new transport protocols were developed to mitigate that, e.g.
>>>> XTP.
>>>>
>>>> In reality, it wasn't TCP per se that was slow. Rather the
>>>> implementation,
>>>> particularly multiple copies of data as the packet went up the stack,
>>>> and between network processor / main processor and between kernel and
>>>> user space.
>>>
>>> TCP per se *is* slow when frame error rate of underlying layers is not
>>> near zero.
>>
>> That's a problem with any transport protocol.
>>
>> The solution to underlying frame errors is FEC, but that reduces the
>> bandwidth when there are no errors. Choose what you optimise for!
> 
> FEC does reduce bandwidth in some sense, but in all of the Ethernet FEC
> implementations I've done, the 64B66B signal is recoded into something
> more efficient to make room for the FEC overhead.  IOW, the raw bit rate
> on the fibre is the same whether FEC is on or off.
> 
> Perhaps a more important issue is latency.  In my experience these are
> block codes, and the entire block must be received before it can be
> corrected.  The last one I did added about 240ns when FEC was enabled.
> 
> Optics modules (e.g. QSFP) that have sufficient margin to work without
> FEC are sometimes marketed as "low latency" even though they have the
> same latency as the ones that require FEC.

Accepted.

My background with FECs is in radio systems, where the
overhead is worse and block length much longer!

Article: 161166
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: Michael Kellett <mk@mkesc.co.uk>
Date: Fri, 8 Feb 2019 13:32:52 +0000
Links: << >> << T >> << A >>

On 05/02/2019 04:47, gnuarm.deletethisbit@gmail.com wrote:
> On Monday, February 4, 2019 at 11:30:33 PM UTC-5, A.P.Richelieu wrote:
>> Den 2019-02-04 kl. 07:29, skrev Swapnil Patil:
>>> Hello folks,
>>>
>>> Let's say I have Spartan 6 board only and i wanted to implement Ethernet communication.So how can it be done?
>>>
>>> I don't want to connect any Hard or Soft core processor.
>>> also I have looked into WIZnet W5300 Ethernet controller interfacing to spartan 6, but I don't want to connect any such controller just spartan 6.
>>> So how can it be done?
>>>
>>> It is not necessary to use spartan 6 board only.If it possible to workout with any another boards I would really like to know. Thanks
>>>
>> Netnod has an open source implementation for a 10GB Ethernet MAC
>> and connects that to an NTP server, all in FPGA.
>> It was not a generic UDP/IP stack, so they had some problems
>> with not beeing able to handle ICMP messages when I last
>> looked at the stuff 2 years ago.
>>
>> They split up incoming packets outside so that all UDP packet
>> to port 123 went to the FPGA.
> 
> So it's not a stand alone solution.  Still, 10 Gbits is impressive.  I've designed comms stuff at lower rates but still fast enough that things couldn't be done in single width, rather they had to be done in parallel.  That gets complicated and big real fast as the speeds increase.  But then "big" is a relative term.  Yesterday's "big" is today's "fits down in the corner of this chip".
> 
> Chips don't get faster so much these days, but they are still getting bigger!
> 
> 
>    Rick C.
> 
>    ---- Tesla referral code - https://ts.la/richard11209
> 

I've done it: not a full every-single-RFC-implemented job, but limited
UDP support. The way it worked (initially) was to use Lattice's
tri-speed Ethernet MAC with a Marvell Gigabit PHY (and later on a switch).
The FPGA handled UDP packets in and out in real time and offloaded any
traffic it didn't understand (like TCP stuff) to an Arm Cortex-M4. It
needed 32-bit-wide SDRAM to keep up with the potential peak data
transfer rate.
We did it because the FPGA was acquiring the data and sending it to a PC
(and sometimes getting data from a PC and streaming it out); the FPGA
did some data processing and buffering. To get the data to the PC it
had to use Ethernet. It could have been done (at the time, several years
ago) with a PCI interface to a PC-class processor running a full OS, but
this would have used far too much power. The Lattice XP3 FPGA did all
the grunt work and used a couple of watts (might have been as much as
three watts).
The UDP system supported multi-fragment messages and used a protocol
which would allow for messages to be sent again if needed.

If anyone wants to pay for TCP/IP and all the trimmings I'd be happy to
consider it.

MK


Article: 161167
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: already5chosen@yahoo.com
Date: Fri, 8 Feb 2019 05:47:23 -0800 (PST)
Links: << >> << T >> << A >>

On Friday, February 8, 2019 at 3:33:01 PM UTC+2, Michael Kellett wrote:
> On 05/02/2019 04:47, gnuarm.deletethisbit@gmail.com wrote:
> > On Monday, February 4, 2019 at 11:30:33 PM UTC-5, A.P.Richelieu wrote:
> >> Den 2019-02-04 kl. 07:29, skrev Swapnil Patil:
> >>> Hello folks,
> >>>
> >>> Let's say I have Spartan 6 board only and i wanted to implement Ethernet communication.So how can it be done?
> >>>
> >>> I don't want to connect any Hard or Soft core processor.
> >>> also I have looked into WIZnet W5300 Ethernet controller interfacing to spartan 6, but I don't want to connect any such controller just spartan 6.
> >>> So how can it be done?
> >>>
> >>> It is not necessary to use spartan 6 board only.If it possible to workout with any another boards I would really like to know. Thanks
> >>>
> >> Netnod has an open source implementation for a 10GB Ethernet MAC
> >> and connects that to an NTP server, all in FPGA.
> >> It was not a generic UDP/IP stack, so they had some problems
> >> with not beeing able to handle ICMP messages when I last
> >> looked at the stuff 2 years ago.
> >>
> >> They split up incoming packets outside so that all UDP packet
> >> to port 123 went to the FPGA.
> >
> > So it's not a stand alone solution.  Still, 10 Gbits is impressive.  I've designed comms stuff at lower rates but still fast enough that things couldn't be done in single width, rather they had to be done in parallel.  That gets complicated and big real fast as the speeds increase.  But then "big" is a relative term.  Yesterday's "big" is today's "fits down in the corner of this chip".
> >
> > Chips don't get faster so much these days, but they are still getting bigger!
> >
> >
> >    Rick C.
> >
> >    ---- Tesla referral code - https://ts.la/richard11209
> >
>
> I've done it, not a full every single RFC implemented job, but a limited
> UDP support.

At that level, who hasn't done it?
Personally, I have lost count of how many times I did it in the last 15 years.
But only transmitters. It's not that UDP reception on a pre-configured port would be much harder; I just never had a need for it.
But TCP is a *completely* different story. And then there are the standard application protocols that run on top of TCP.
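
Part of what keeps a transmit-only UDP path tractable in logic is that the per-packet arithmetic is small. One piece of it is the IPv4 header checksum, a 16-bit ones' complement sum. The Python below is a software model of that calculation; the function name is illustrative and nothing here is taken from the designs mentioned in this post.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """Return the IPv4 header checksum for a header whose checksum field is zero."""
    if len(header) % 2:
        raise ValueError("header length must be a multiple of 2 bytes")
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]   # big-endian 16-bit words
    while total >> 16:                               # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# 20-byte header, protocol 0x11 (UDP), checksum field zeroed.
hdr = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(ipv4_header_checksum(hdr)))  # 0xb861
```

A hardware version would typically accumulate the same sum as the header words stream past, which is why fixed-format transmitters are comparatively easy.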

> The way it worked (initially) was to use Lattice's
> tri-speed Ethernet MAC with Marvell Gigabit Phy (and later on a switch).
> The FPGA handled UDPs in and out in real time and offloaded any traffic
> it didn't understand (like tcp stuff) to an Arm Cortex M4. It needed 32
> bit wide SDRAM to keep up with the potential peak data transfer rate.
> We did it because the FPGA was acquiring the data and sending it to a PC
> (and sometimes getting data from a PC and streaming it out), the FPGA
> did some data processing and buffering - to get the data to the PC it
> had to use Ethernet, it could have been done (at the time, several years
> ago) with a PCI interface to a PC class processor running a full OS, but
> this would have used far too much power. The Lattice XP3 FPGA did all
> the grunt work and used a couple of watts (might have been as much as
> three watts).
> The UDP system supported multi fragment messages and used a protocol
> which would allow for messages to be sent again if needed.
>
> If any one wants to pay for tcp-ip and all the trimmings I'd be happy to
> consider it.
>
>
> MK

Article: 161168
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: Les Cargill <lcargill99@comcast.com>
Date: Sat, 9 Feb 2019 21:04:01 -0600
Links: << >> << T >> << A >>

Tom Gardner wrote:
> On 07/02/19 10:23, already5chosen@yahoo.com wrote:
>> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:
>>>
>>> Back in the late 80s there was the perception that TCP was slow, and 
>>> hence
>>> new transport protocols were developed to mitigate that, e.g. XTP.
>>>
>>> In reality, it wasn't TCP per se that was slow. Rather the 
>>> implementation,
>>> particularly multiple copies of data as the packet went up the stack, 
>>> and
>>> between network processor / main processor and between kernel and 
>>> user space.
>>
>> TCP per se *is* slow when frame error rate of underlying layers is not 
>> near
>> zero.
> 
> That's a problem with any transport protocol.
> 
> The solution to underlying frame errors is FEC, but that
> reduces the bandwidth when there are no errors. Choose
> what you optimise for!
> 
> 
>> Also, there exist cases of "interesting" interactions between Nagle 
>> algorithm
>> at transmitter and ACK saving algorithm at receiver that can lead to 
>> slowness
>> of certain styles of TCP conversions (Send mid-size block of data, 
>> wait for
>> application-level acknowledge, send next mid-size block) that is 
>> typically
>> resolved by not following the language of RFCs too literally.
> 
> That sounds like a "corner case". I'd be surprised
> if you couldn't find corner cases in all transport
> protocols.

But if you need absolute maximum throughput, it's often advantageous to 
move the retransmission mechanism up the software stack. You can
take advantage of local specialized knowledge rather than pay the "TCP 
tax".

-- 
Les Cargill

Article: 161169
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: dgreig@ieee.org
Date: Sun, 10 Feb 2019 12:48:18 -0800 (PST)
Links: << >> << T >> << A >>

On Monday, February 4, 2019 at 6:29:45 AM UTC, Swapnil Patil wrote:
> Hello folks, 
> 
> Let's say I have Spartan 6 board only and i wanted to implement Ethernet communication.So how can it be done?
> 
> I don't want to connect any Hard or Soft core processor.
> also I have looked into WIZnet W5300 Ethernet controller interfacing to spartan 6, but I don't want to connect any such controller just spartan 6.
> So how can it be done?
> 
> It is not necessary to use spartan 6 board only.If it possible to workout with any another boards I would really like to know. Thanks

--------------------------------------------------------------------------------
An indirect solution would be to offload Ethernet to a hard-wired UDP/TCP ASIC.
Wiznet has developed such devices, and I have somewhat more than 10k designs in the field that use them.
Unless you require a cheap high-volume solution (requiring development, verification and validation time and money), Wiznet may well be a zero-time and minimal-cost solution.
I have used both W5300 https://www.wiznet.io/product-item/w5300/
and W3150A+ https://www.wiznet.io/product-item/w3150a+/
devices.

TCP has the drawback of latency and automatic resends.
In real-time applications my preference is UDP: lost packets are okay, but packets resent by TCP are a waste of bandwidth because they are out of date.
My applications have been in heavy-industry machine vision, where fibre is too fragile and RF is unsuitable because of line-of-sight and interference issues.
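
The "late data is useless" argument for UDP can be sketched as a receiver that tags each datagram with a sequence number and discards anything older than the newest frame already seen. The 4-byte big-endian sequence header and the class name are assumptions for illustration, not the protocol used in the designs mentioned above, and sequence wrap-around is ignored.

```python
import struct
from typing import Optional

class LatestFrameReceiver:
    """Keep only the newest frame; drop late or duplicate datagrams."""

    def __init__(self) -> None:
        self.last_seq = -1

    def accept(self, datagram: bytes) -> Optional[bytes]:
        """Return the payload if the datagram is newer than anything seen, else None."""
        if len(datagram) < 4:
            return None                      # too short to carry a sequence number
        (seq,) = struct.unpack("!I", datagram[:4])
        if seq <= self.last_seq:
            return None                      # stale or duplicate: nothing is re-requested
        self.last_seq = seq
        return datagram[4:]
```
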

Article: 161170
Subject: Re: Is it possible to implement Ethernet on bare metal FPGA, Without
From: Tom Gardner <spamjunk@blueyonder.co.uk>
Date: Tue, 12 Feb 2019 20:46:57 +0000
Links: << >> << T >> << A >>

On 10/02/19 03:04, Les Cargill wrote:
> Tom Gardner wrote:
>> On 07/02/19 10:23, already5chosen@yahoo.com wrote:
>>> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:
>>>>
>>>> Back in the late 80s there was the perception that TCP was slow, and hence
>>>> new transport protocols were developed to mitigate that, e.g. XTP.
>>>>
>>>> In reality, it wasn't TCP per se that was slow. Rather the implementation,
>>>> particularly multiple copies of data as the packet went up the stack, and
>>>> between network processor / main processor and between kernel and user space.
>>>
>>> TCP per se *is* slow when frame error rate of underlying layers is not near
>>> zero.
>>
>> That's a problem with any transport protocol.
>>
>> The solution to underlying frame errors is FEC, but that
>> reduces the bandwidth when there are no errors. Choose
>> what you optimise for!
>>
>>
>>> Also, there exist cases of "interesting" interactions between Nagle algorithm
>>> at transmitter and ACK saving algorithm at receiver that can lead to slowness
>>> of certain styles of TCP conversions (Send mid-size block of data, wait for
>>> application-level acknowledge, send next mid-size block) that is typically
>>> resolved by not following the language of RFCs too literally.
>>
>> That sounds like a "corner case". I'd be surprised
>> if you couldn't find corner cases in all transport
>> protocols.
> 
> But if you need absolute maximum throughput, it's often advantageous to move the 
> retransmission mechanism up the software stack. You can
> take advantage of local specialized knowledge rather than pay the "TCP tax".

The devil is indeed in trading off generality for performance.

There's the old aphorism...
If you know how to optimise, then optimise.
If you don't know how to optimise, then randomise.

Article: 161171
Subject: MachXO2 internal clock tolerance / accuracy
From: tcz2008 <yvo.zoer@gmail.com>
Date: Wed, 13 Feb 2019 20:32:54 -0800 (PST)
Links: << >> << T >> << A >>

Hi everyone!

I have a hard time finding the tolerance / accuracy for the internal oscillator for the MachXO2. I seem to remember it being around 5%, which isn't really that great.
Can anyone point me in the direction where that's definitively mentioned?!

Cheers!

-Mux

Article: 161172
Subject: Re: MachXO2 internal clock tolerance / accuracy
From: Thomas Heller <theller@ctypes.org>
Date: Thu, 14 Feb 2019 10:32:41 +0100
Links: << >> << T >> << A >>

Am 14.02.2019 um 05:32 schrieb tcz2008:
> I have a hard time finding the tolerance / accuracy for the internal
> oscillator for the MachXO2. I seem to remember it being around 5%,
> which isn't really that great. Can anyone point me in the direction
> where that's definitively mentioned?!

It's even slightly worse than 5%:


MachXO2 Family Data Sheet, DS1035 Version 3.3, March 2017
page 3-33:

Oscillator Output Frequency (Commercial Grade Devices, 0 to 85 °C)
  Min 125.685 / Typ 133 / Max 140.315 MHz
Oscillator Output Frequency (Industrial Grade Devices, –40 °C to 100 °C)
  Min 124.355 / Typ 133 / Max 141.645 MHz


MachXO2 sysCLOCK PLL Design and Usage Guide
March 2017 Technical Note TN1199,
page 28:

The MachXO2 device has an internal oscillator that can be used as a 
clock source in a design. The internal oscillator accuracy is +/- 5% 
(nominal). This oscillator is intended as a clock source for 
applications that do not require a higher degree of accuracy in the clock.
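
As a quick check on the data sheet figures quoted above, the minimum and maximum limits can be converted into a percentage deviation from the 133 MHz typical value. The sketch below is just arithmetic on the quoted numbers; the function name `deviation_percent` and the `GRADES` table are my own.

```python
# Deviation of the MachXO2 internal oscillator limits from the typical value,
# using only the min/typ/max figures quoted from DS1035 above.

def deviation_percent(limit_mhz: float, typ_mhz: float) -> float:
    """Signed deviation of a limit from the typical frequency, in percent."""
    return (limit_mhz - typ_mhz) / typ_mhz * 100.0

# (min, typ, max) in MHz, as quoted from the data sheet
GRADES = {
    "commercial": (125.685, 133.0, 140.315),
    "industrial": (124.355, 133.0, 141.645),
}

for grade, (f_min, f_typ, f_max) in GRADES.items():
    low = deviation_percent(f_min, f_typ)
    high = deviation_percent(f_max, f_typ)
    print(f"{grade}: {low:+.1f}% / {high:+.1f}%")
```

On these numbers the limits work out to about +/- 5.5% for commercial grade and +/- 6.5% for industrial grade, which is consistent with "slightly worse than 5%".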

Article: 161173
Subject: Re: Altera Cyclone replacement
From: already5chosen@yahoo.com
Date: Thu, 14 Feb 2019 02:07:47 -0800 (PST)
Links: << >> << T >> << A >>

On Thursday, February 7, 2019 at 11:43:39 PM UTC+2, gnuarm.del...@gmail.com wrote:
>
> Ok, if you are doing C in FPGA CPUs then you are in a different world than the stuff I've worked on.  My projects use a CPU as a controller and often have very critical real time requirements.  While C doesn't prevent that, I prefer to just code in assembly language and more importantly, use a CPU design that provides single cycle execution of all instructions.  That's why I like stack processors, they are easy to design, use a very simple instruction set and the assembly language can be very close to the Forth high level language.
>

Can you quantify criticality of your real-time requirements?

Also, even for most critical requirements, what's wrong with multiple cycles per instruction as long as # of cycles is known up front?
Things like caches and branch predictors indeed cause variability (which by itself is o.k. for 99.9% of uses), but that's orthogonal to # of cycles per instruction.

>
> > >  Many stack based CPUs can be implemented in 1k LUT4s or less.  They can run fast, >100 MHz and typically are not pipelined.

1 cycle per instruction not pipelined means that stack can not be implemented in memory block(s). Which, in combination with 1K LUT4s means that either stack is very shallow or it is not wide (i.e. 16 bits rather than 32 bits). Either one means that you need many more instructions (relative to 32-bit RISC with 32 or 16 registers) to complete the job.

Also 1 cycle per instruction necessitates either strict Harvard memories or true dual-ported memories.

And even with all those conditions in place, non-pipelined conditional branches at 100 MHz sound hard. Not impossible if your FPGA is very fast, like top-speed Arria-10, where you can instantiate Nios2e at 380 MHz and full-featured Nios2f at 300 MHz+. But it does look impossible in low speed grades budget parts, like slowest speed grades of Cyclone4E/10LP or even of Cyclone5. And I suppose that Lattice Mach series is somewhat slower than even those.
The only way that I can see non-pipelined conditional branches work at 100 MHz in low end devices is if your architecture has branch delay slot. But that by itself is sort of pipelining, just instead of being done in HW, it is pipelining exposed to SW.

Besides, my current hobby interest is in 500-700 LUT4s rather than in 1000+ LUT4s. If 1000 LUT4s are available then 1400 LUT4s are probably available too, so one can as well use OTS Nios2f which is pretty fast and validated to the level that hobbyist's cores can't even dream about.

Article: 161174
Subject: Re: Altera Cyclone replacement
From: gnuarm.deletethisbit@gmail.com
Date: Thu, 14 Feb 2019 03:24:35 -0800 (PST)
Links: << >> << T >> << A >>

On Thursday, February 14, 2019 at 5:07:53 AM UTC-5, already...@yahoo.com wrote:
> On Thursday, February 7, 2019 at 11:43:39 PM UTC+2, gnuarm.del...@gmail.com wrote:
> >
> > Ok, if you are doing C in FPGA CPUs then you are in a different world than the stuff I've worked on.  My projects use a CPU as a controller and often have very critical real time requirements.  While C doesn't prevent that, I prefer to just code in assembly language and more importantly, use a CPU design that provides single cycle execution of all instructions.  That's why I like stack processors, they are easy to design, use a very simple instruction set and the assembly language can be very close to the Forth high level language.
> >
>
> Can you quantify criticality of your real-time requirements?

Eh?  You are asking my requirement or asking how important it is?  Not sure how to answer that question.  I can only say that my CPU designs give single cycle execution, so I can design with them the same way I design the hardware in VHDL.

> Also, even for most critical requirements, what's wrong with multiple cycles per instruction as long as # of cycles is known up front?

It increases interrupt latency which is not a problem if you aren't using interrupts, a common technique for such embedded processors.  Otherwise multi-cycle instructions complicate the CPU instruction decoder.  Using a short instruction format allows minimal decode logic.  Adding a cycle counter increases the number of inputs to the instruction decode block and so complicates the logic significantly.
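
The interrupt latency point can be put in numbers with a small sketch. It assumes (my assumption, not stated in the post) that an interrupt is only accepted at an instruction boundary and that the request arrives just after the longest instruction has started. The cycle mixes below are hypothetical.

```python
# Worst-case wait for the in-flight instruction to finish before an interrupt
# can be taken, under the boundary-acceptance assumption described above.

def worst_case_wait_cycles(cycle_counts):
    """Cycles spent finishing the current instruction, in the worst case."""
    return max(cycle_counts) - 1

def cycles_to_ns(cycles, f_clk_mhz):
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1000.0 / f_clk_mhz

single_cycle_core = [1, 1, 1, 1]   # every instruction takes one cycle
multi_cycle_core = [1, 2, 2, 4]    # hypothetical mix, longest takes four

for name, counts in (("single-cycle", single_cycle_core),
                     ("multi-cycle", multi_cycle_core)):
    wait = worst_case_wait_cycles(counts)
    print(name, wait, "cycles,", cycles_to_ns(wait, 100.0), "ns at 100 MHz")
```

With single-cycle instructions the extra wait is zero; with a four-cycle instruction in the mix it can reach three cycles.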

> Things like caches and branch predictors indeed cause variability (which by itself is o.k. for 99.9% of uses), but that's orthogonal to # of cycles per instruction.

Cache, branch predictors???  You have that with 1 kLUT CPUs???  I think we design in very different worlds.  My program storage is inside the FPGA and runs at the full speed of the CPU.  The CPU is not pipelined (according to me, someone insisted that it was a 2 level pipeline, but with no pipeline delay, oh well) so no branch prediction needed.

> > > >  Many stack based CPUs can be implemented in 1k LUT4s or less.  They can run fast, >100 MHz and typically are not pipelined.
>
> 1 cycle per instruction not pipelined means that stack can not be implemented in memory block(s). Which, in combination with 1K LUT4s means that either stack is very shallow or it is not wide (i.e. 16 bits rather than 32 bits). Either one means that you need many more instructions (relative to 32-bit RISC with 32 or 16 registers) to complete the job.

Huh?  So my block RAM stack is pipelined or are you saying I'm only imagining it runs in one clock cycle?  Instructions are things like

ADD, CALL, SHRC (shift right with carry), FETCH (read memory), RET (return from call), RETI (return from interrupt).  The interrupt pushes return address to return stack and PSW to data stack in one cycle with no latency so, like the other instructions, it is single cycle, again making using it like designing with registers in the HDL code.
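
A behavioural sketch of the kind of machine described above may help readers who have not used a stack CPU. It is not the actual design: the 16-bit width, the `LIT` helper instruction, the PSW reduced to a single carry bit, and the instruction encoding are all assumptions made for illustration. Each call to `step()` or `interrupt()` stands for one clock cycle.

```python
class StackCPU:
    """Toy model: separate instruction/data memories, data and return stacks."""

    def __init__(self, program, data):
        self.program = program   # instruction memory (Harvard: separate from data)
        self.data = data         # data memory
        self.dstack = []         # data stack
        self.rstack = []         # return stack
        self.pc = 0
        self.carry = 0           # the only PSW bit modelled here (assumption)
        self.cycles = 0

    def step(self):
        """Execute one instruction; counts as exactly one cycle."""
        op, arg = self.program[self.pc]
        next_pc = self.pc + 1
        if op == "LIT":          # push a literal (helper, not named in the post)
            self.dstack.append(arg)
        elif op == "ADD":
            b, a = self.dstack.pop(), self.dstack.pop()
            total = a + b
            self.carry = (total >> 16) & 1
            self.dstack.append(total & 0xFFFF)
        elif op == "SHRC":       # shift right, old carry enters at the top bit
            a = self.dstack.pop()
            self.dstack.append((self.carry << 15) | (a >> 1))
            self.carry = a & 1
        elif op == "FETCH":      # replace address on top of stack with its data
            self.dstack.append(self.data[self.dstack.pop()])
        elif op == "CALL":
            self.rstack.append(next_pc)
            next_pc = arg
        elif op == "RET":
            next_pc = self.rstack.pop()
        elif op == "RETI":       # assumes the handler left the PSW on top
            self.carry = self.dstack.pop()
            next_pc = self.rstack.pop()
        else:
            raise ValueError(f"unknown instruction {op!r}")
        self.pc = next_pc
        self.cycles += 1

    def interrupt(self, vector):
        """Push return address and PSW, jump to the handler, in one cycle."""
        self.rstack.append(self.pc)
        self.dstack.append(self.carry)
        self.pc = vector
        self.cycles += 1


program = [
    ("LIT", 1),      # 0: address of the operand in data memory
    ("FETCH", None), # 1: read data[1]
    ("CALL", 5),     # 2: call the subroutine at address 5
    ("ADD", None),   # 3: add the fetched value and the subroutine's result
    ("LIT", 0),      # 4: never reached in this demo
    ("LIT", 2),      # 5: subroutine body: push 2
    ("RET", None),   # 6: return to address 3
]
cpu = StackCPU(program, data=[0, 40])
for _ in range(6):
    cpu.step()
print(cpu.dstack, cpu.cycles)
```

Six instructions execute in six modelled cycles, including the call and return, which is the property the post relies on when treating the CPU like registered logic.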

> Also 1 cycle per instruction necessitates either strict Harvard memories or true dual-ported memories.

Or both.  To get the block RAMs single cycle the read and write happen on different phases of the main clock.  I think read is on falling edge while write is on rising edge like the rest of the logic.  Instructions and data are in physically separate memory within the same address map, but no way to use either one as the other mechanically.  Why would Harvard ever be a problem for an embedded CPU?

> And even with all those conditions in place, non-pipelined conditional branches at 100 MHz sound hard.

Not hard when the CPU is simple and designed to be easy to implement rather than designing it to be like all the other CPUs with complicated functionality.

> Not impossible if your FPGA is very fast, like top-speed Arria-10, where you can instantiate Nios2e at 380 MHz and full-featured Nios2f at 300 MHz+. But it does look impossible in low speed grades budget parts, like slowest speed grades of Cyclone4E/10LP or even of Cyclone5. And I suppose that Lattice Mach series is somewhat slower than even those.

I only use the low grade parts.  I haven't used NIOS and this processor won't get to 380 MHz I'm pretty sure.  Pipelining it would be counter to its design goals but might be practical, never thought about it.

> The only way that I can see non-pipelined conditional branches work at 100 MHz in low end devices is if your architecture has branch delay slot. But that by itself is sort of pipelining, just instead of being done in HW, it is pipelining exposed to SW.

Or the instruction is simple and runs fast.

> Besides, my current hobby interest is in 500-700 LUT4s rather than in 1000+ LUT4s. If 1000 LUT4 available then 1400 LUT4 are probably available too, so one can as well use OTS Nios2f which is pretty fast and validated to the level that hobbyist's cores can't even dream about.

That's where my CPU lies, I think it was 600 LUT4s last time I checked.

Rick C.
